@seanyao/roll 2026.528.1 → 2026.529.1

package/CHANGELOG.md CHANGED
@@ -1,14 +1,39 @@
1
1
  # Changelog
2
2
 
3
- ## v2026.528.1
3
+ ## v2026.529.1
4
4
 
5
5
  ### Added
6
6
 
7
- - **`roll test` runs tests in isolation and no longer disrupts the local loop service** `[loop]`
7
+ - **The loop's active hours can be set per project**, so one global setting no longer has to fit every project
8
8
 
9
9
  ### Fixed
10
10
 
11
- - **After Kimi CLI was renamed to kimi-code, roll now detects it correctly** `[loop]`
11
+ - **`roll loop on` no longer shows every time as 00:00**: times now display correctly
12
+ - **Agent detection no longer gives false positives**: tools that are not installed are no longer reported as available
13
+
14
+ ## v2026.528.2
15
+
16
+ ### Added
17
+
18
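+ - **Moving a loop to another machine no longer picks up a stale backlog**: previously, a loop that ran on machine A and was continued on machine B used B's old local backlog and did not know about new items; now the latest state is pulled automatically before each cycle, so every machine sees the same backlog `[loop]`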
19
+
20
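+ - **The loop no longer sits idle when CI is red**: when tests on the main branch failed, the loop used to stop and wait for a person to fix them; now it first analyzes the failure itself and submits a fix, and only alerts a human after 3 failed attempts; PRs it opened itself that CI marks red are also picked up again and fixed automatically, rather than being quietly abandoned because "this loop cycle has already ended" `[loop]`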
21
+
22
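+ - **`roll test`: tests run in an isolated environment and no longer disrupt the local machine**: previously, running the full test suite touched the local loop scheduling service, and a test run would kill the loop that was running; now tests run in a separate macOS VM and the local loop is unaffected `[test]`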
23
+
24
+ ### Fixed
25
+
26
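+ - **Kimi CLI is now recognized end to end after its rename**: Kimi renamed its tool from kimi-cli to kimi-code and changed the install directory; Roll now recognizes both the old and new names, kimi-code's install path has been added to the automatic lookup paths so it is also found in the scheduler environment, and existing configuration needs no changes `[agent]`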
27
+
28
+ - **`roll loop log` now actually works**: archiving of each cycle's log is fixed; previously the files were never created, and now `roll loop log` shows what each cycle did `[loop]`
29
+
30
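+ - **The terminal window no longer vanishes the moment a loop cycle finishes**: previously the Terminal window closed as soon as the cycle ended, before you could see what it had done; now the window stays open until you close it yourself `[loop]`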
31
+
32
+ ### Improved
33
+
34
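+ - **The commit safety check can no longer be silently bypassed**: Roll's TCR requires tests to pass before every commit; previously, when starting work in a new terminal or on a new machine, this check could silently stop working because the git configuration was not applied automatically, and automated environments were especially prone to this; now it is configured automatically whenever a new Claude Code session is opened or `roll setup` is run, closing the gap at the source `[infra]`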
35
+
36
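+ - **Test cases with Chinese names are no longer silently skipped under macOS's bundled bash**: the older bash that ships with macOS truncated test names containing Chinese or special characters, so those tests were never run yet no error was reported; local runs looked all green while a batch of cases was actually missed; this is fixed, and local and CI results now agree `[test]`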
12
37
 
13
38
  ## v2026.527.1
14
39
 
package/README.md CHANGED
@@ -50,6 +50,7 @@ roll loop on # let AI work through the backlog (optional)
50
50
  | `roll status` | Show current state and drift |
51
51
  | `roll agent [use <name>]` | Per-project agent selection |
52
52
  | `roll ci [--wait]` | Show or wait for current commit's CI status |
53
+ | `roll test [--where] [--reset]` | Run the test suite (routes through isolation adapter; Tart VM on Apple Silicon) |
53
54
  | `roll release` | Run the release script (human-only) |
54
55
  | `roll review-pr <number>` | AI-powered code review for a PR |
55
56
  | **Machine · global** | |
package/bin/roll CHANGED
@@ -4,7 +4,7 @@ set -euo pipefail
4
4
  # Roll — AI Agent Convention Manager
5
5
  # Single source of truth for how all AI coding agents behave.
6
6
 
7
- VERSION="2026.528.1"
7
+ VERSION="2026.529.1"
8
8
  ROLL_HOME="${ROLL_HOME:-${HOME}/.roll}"
9
9
  ROLL_CONFIG="${ROLL_HOME}/config.yaml"
10
10
  ROLL_GLOBAL="${ROLL_HOME}/conventions/global"
@@ -83,31 +83,80 @@ lower_name() {
83
83
  }
84
84
 
85
85
 
86
- # Check if an AI tool is actually installed.
87
- # Most tools create their own config dir; Trae on macOS uses Library/Application Support
88
- # and expects roll to manage ~/.trae/ so we detect via the app directory instead.
89
- _is_ai_installed() {
90
- local ai_dir="$1"
91
- [[ -d "$ai_dir" ]] && return 0
92
- local bn
93
- bn="$(basename "$ai_dir" | sed 's/^\.//')"
94
- case "$bn" in
86
+ # FIX-128: agent binary-name(s) lookup. First binary found wins. Used by
87
+ # _agent_installed_by_name to enforce "CLI must exist on PATH" detection
88
+ # instead of the old "config dir exists" check (which Roll's own convention
89
+ # sync would fake — see FIX-128).
90
+ _agent_bin_names() {
91
+ case "$1" in
92
+ claude) echo "claude" ;;
93
+ codex|openai) echo "codex" ;; # openai is a Roll alias for codex
94
+ agy|gemini) echo "agy gemini" ;; # gemini reuses ~/.gemini for agy
95
+ kimi) echo "kimi-code kimi-cli kimi" ;; # FIX-126
96
+ deepseek) echo "deepseek" ;;
97
+ qwen) echo "qwen" ;;
98
+ pi) echo "pi" ;;
99
+ *) return 1 ;;
100
+ esac
101
+ }
102
+
103
+ # FIX-128: detect whether an AI agent (by canonical name) is actually
104
+ # usable on this machine. For CLI-only agents this means "binary on PATH";
105
+ # GUI / bundled-binary agents keep their special-case paths. Falls back
106
+ # to dir-existence only for unknown agents the operator has registered
107
+ # manually (forward-compatible with future additions).
108
+ _agent_installed_by_name() {
109
+ local agent="$1"
110
+ local dir="${2:-}"
111
+ case "$agent" in
95
112
  trae)
96
- [[ -d "$HOME/Library/Application Support/Trae" ]] ||
97
- [[ -d "$HOME/.config/Trae" ]]
113
+ [[ -d "$HOME/Library/Application Support/Trae" ]] || [[ -d "$HOME/.config/Trae" ]]
98
114
  return
99
115
  ;;
100
116
  opencode)
101
117
  [[ -x "$HOME/.opencode/bin/opencode" ]]
102
118
  return
103
119
  ;;
104
- agent)
105
- if [[ "$(basename "$(dirname "$ai_dir")")" == ".pi" ]]; then
106
- command -v pi &>/dev/null && return
107
- fi
120
+ cursor)
121
+ # cursor ships a GUI + an optional CLI; either path counts as "installed".
122
+ command -v cursor >/dev/null 2>&1 || [[ -d "$HOME/.cursor" ]]
123
+ return
124
+ ;;
125
+ openclaw)
126
+ [[ -d "$HOME/.openclaw/workspace" ]]
127
+ return
108
128
  ;;
109
129
  esac
110
- return 1
130
+ local bins
131
+ if bins=$(_agent_bin_names "$agent" 2>/dev/null); then
132
+ local b
133
+ for b in $bins; do
134
+ command -v "$b" >/dev/null 2>&1 && return 0
135
+ done
136
+ return 1
137
+ fi
138
+ # Unknown agent — fall back to dir presence so user-added entries still work.
139
+ [[ -n "$dir" && -d "$dir" ]]
140
+ }
141
+
142
+ # Check if an AI tool is actually installed (back-compat shim around
143
+ # _agent_installed_by_name; preserves the dir-path-based signature used
144
+ # throughout bin/roll).
145
+ _is_ai_installed() {
146
+ local ai_dir="$1"
147
+ local bn
148
+ bn="$(basename "$ai_dir" | sed 's/^\.//')"
149
+ # Nested-dir layouts collapse to their parent agent name.
150
+ case "$bn" in
151
+ agent|workspace)
152
+ bn="$(basename "$(dirname "$ai_dir")" | sed 's/^\.//')"
153
+ ;;
154
+ esac
155
+ # Mirror ai_tool_name's alias normalization so detection routes to the
156
+ # canonical agent record (e.g. ~/.gemini → agy, ~/.kimi-code → kimi).
157
+ [[ "$bn" == "gemini" ]] && bn="agy"
158
+ [[ "$bn" == "kimi-code" ]] && bn="kimi"
159
+ _agent_installed_by_name "$bn" "$ai_dir"
111
160
  }
112
161
 
113
162
  # ─── Spinner: TTY-aware status display for long-running steps (US-REL-003) ───
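The hunk above moves agent detection from "config dir exists" to "CLI binary is on PATH". The following is a minimal sketch of that lookup pattern, not the code shipped in bin/roll: `first_bin_on_path` is a hypothetical helper that takes candidate binary names and prints the first one that `command -v` resolves.

```shell
# Hypothetical helper (not part of bin/roll): print the first candidate
# binary that resolves on PATH; return 1 when none of them does.
first_bin_on_path() {
  local b
  for b in "$@"; do
    if command -v "$b" >/dev/null 2>&1; then
      printf '%s\n' "$b"
      return 0
    fi
  done
  return 1
}

# Usage: probe the new and old Kimi binary names in priority order.
first_bin_on_path kimi-code kimi-cli kimi || echo "kimi: no CLI found on PATH"
```

Checking the binary rather than the directory matters here because, as the FIX-128 comments note, Roll's own convention sync can create the config directory even when the tool was never installed.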
@@ -512,8 +561,7 @@ editor: ${EDITOR:-vim}
512
561
 
513
562
  # Loop schedule (24h format, machine local timezone)
514
563
  # Minute fields auto-derive from project path hash when omitted — avoids contention across projects.
515
- loop_active_start: 10 # loop only fires inside this window (after human reviews brief)
516
- loop_active_end: 18
564
+ # active_start/active_end moved to per-project .roll/local.yaml loop_schedule block (default 0/24).
517
565
  # loop_minute: 5 # omit to auto-derive from project hash
518
566
  loop_dream_hour: 3
519
567
  # loop_dream_minute: 10 # omit to auto-derive
@@ -522,6 +570,19 @@ loop_brief_hour: 9
522
570
  primary_agent: claude
523
571
  YAML
524
572
  ok "$(msg shared.created_roll_config_yaml)"
573
+
574
+ # FIX-128: the heredoc template hardcodes `primary_agent: claude` for
575
+ # the first-time case. Replace it with the first agent that actually
576
+ # has its CLI on PATH so users without Claude installed don't get a
577
+ # silently-broken default. If nothing detected, leave `claude` so the
578
+ # user still has a clear handle to fix manually.
579
+ local _detected_primary
580
+ _detected_primary="$(_first_installed_agent || true)"
581
+ if [[ -n "$_detected_primary" && "$_detected_primary" != "claude" ]]; then
582
+ _replace_primary_agent "$_detected_primary"
583
+ info "$(msg shared.primary_agent_auto_detected "$_detected_primary" 2>/dev/null \
584
+ || echo "primary_agent → $_detected_primary (auto-detected from installed CLIs)")"
585
+ fi
525
586
  fi
526
587
 
527
588
  # Ensure all expected ai_* keys exist (handles upgrades where new tools were added)
@@ -529,6 +590,32 @@ YAML
529
590
 
530
591
  }
531
592
 
593
+ # FIX-128: pick the first agent whose CLI is on PATH, scanning the same
594
+ # order the default config template lists them. Empty stdout when none
595
+ # detected; never errors.
596
+ _first_installed_agent() {
597
+ local agent
598
+ for agent in claude codex kimi deepseek qwen agy pi cursor opencode trae openclaw; do
599
+ if _agent_installed_by_name "$agent"; then
600
+ echo "$agent"
601
+ return 0
602
+ fi
603
+ done
604
+ return 1
605
+ }
606
+
607
+ # FIX-128: rewrite the `primary_agent:` line in $ROLL_CONFIG to the given
608
+ # value. Single-line in-place edit, preserves the rest of the file.
609
+ _replace_primary_agent() {
610
+ local new="$1"
611
+ [[ -f "$ROLL_CONFIG" && -n "$new" ]] || return 0
612
+ local tmp; tmp="$(mktemp)"
613
+ awk -v new="$new" '
614
+ /^primary_agent:/ { print "primary_agent: " new; next }
615
+ { print }
616
+ ' "$ROLL_CONFIG" > "$tmp" && mv "$tmp" "$ROLL_CONFIG"
617
+ }
618
+
532
619
  # ─── Internal: create or repair per-skill symlinks (non-destructive) ─────────
533
620
  _link_skills() {
534
621
  local force="${1:-false}"
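The `_replace_primary_agent` helper in this hunk rewrites a single key by streaming the file through awk into a temp file and then moving it over the original. Below is a sketch of the same pattern in generalized form, assuming a hypothetical function name `replace_yaml_key` and a throwaway demo file; it only handles flat `key: value` lines at column zero.

```shell
# Hypothetical generalization of the awk + temp-file rewrite shown above.
# Replaces the line starting with "<key>:" and leaves every other line as-is.
replace_yaml_key() {
  local file="$1" key="$2" value="$3" tmp
  [ -f "$file" ] || return 0
  tmp=$(mktemp) || return 1
  awk -v k="$key" -v v="$value" '
    index($0, k ":") == 1 { print k ": " v; next }
    { print }
  ' "$file" > "$tmp" && mv "$tmp" "$file"
}

# Demo on a temporary config file (contents are made up for illustration).
demo_cfg=$(mktemp)
printf 'editor: vim\nprimary_agent: claude\n' > "$demo_cfg"
replace_yaml_key "$demo_cfg" primary_agent codex
cat "$demo_cfg"
rm -f "$demo_cfg"
```

One design point to be aware of with this pattern: `mktemp` creates the temp file with mode 0600, so after the `mv` the target file carries that mode rather than its previous permissions.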
@@ -539,7 +626,18 @@ _link_skills() {
539
626
  while IFS= read -r entry; do
540
627
  local ai_dir
541
628
  ai_dir="$(_ai_dir "$entry")"
542
- _is_ai_installed "$ai_dir" || continue
629
+ # FIX-128: detection is now binary-on-PATH, but skill linking keeps
630
+ # the same Claude-always-syncs semantics as _apply_conventions and
631
+ # tolerates pre-existing config dirs (an agent the user is mid-
632
+ # upgrade or installed via nvm/asdf still has its convention dir;
633
+ # we don't want to silently stop linking skills there). Strict
634
+ # binary detection drives chooser logic (primary_agent /
635
+ # _onboard_discover_agents) — see FIX-128.
636
+ if [[ "$ai_dir" != "$HOME/.claude" ]] \
637
+ && ! _is_ai_installed "$ai_dir" \
638
+ && [[ ! -d "$ai_dir" ]]; then
639
+ continue
640
+ fi
543
641
  mkdir -p "$ai_dir"
544
642
 
545
643
  local ai_name ai_dir_real skills_dir
@@ -639,8 +737,13 @@ _sync_convention_for_tool() {
639
737
  local dst_dir
640
738
  dst_dir="$(dirname "$main_dst")"
641
739
 
642
- # Only proceed if Claude (always) or the tool is installed
643
- if [[ "$dst_dir" != "$HOME/.claude" ]] && ! _is_ai_installed "$dst_dir"; then
740
+ # Only proceed if Claude (always), the tool is installed (binary-on-PATH
741
+ # per FIX-128), or the convention dir already exists (mid-upgrade /
742
+ # nvm-installed binaries that aren't on this shell's PATH still get
743
+ # their convention refresh).
744
+ if [[ "$dst_dir" != "$HOME/.claude" ]] \
745
+ && ! _is_ai_installed "$dst_dir" \
746
+ && [[ ! -d "$dst_dir" ]]; then
644
747
  return
645
748
  fi
646
749
  mkdir -p "$dst_dir"
@@ -747,6 +850,22 @@ _setup_snapshot() {
747
850
  # _ROLL_SETUP_STATE. Caller passes the watch dir(s) plus the command + args.
748
851
  # stdout/stderr of the inner command are suppressed (same as the previous
749
852
  # pattern in cmd_setup) to keep the v2 UI render the only user-visible output.
853
+ # US-INFRA-008: ensure core.hooksPath is set to 'hooks' so TCR pre-commit gate
854
+ # is never silently bypassed in new clones, worktrees, or automated environments.
855
+ # Idempotent: already set to a non-default value → leave it (user knows better).
856
+ # Not a git repo → silently skip.
857
+ _ensure_hooks_path() {
858
+ local repo_path="${1:-$PWD}"
859
+ # Must be a git repo
860
+ git -C "$repo_path" rev-parse --git-dir >/dev/null 2>&1 || return 0
861
+ local current; current=$(git -C "$repo_path" config core.hooksPath 2>/dev/null || echo "")
862
+ # Only set when unset or pointing at the git default (.git/hooks)
863
+ if [[ -z "$current" || "$current" == ".git/hooks" ]]; then
864
+ git -C "$repo_path" config core.hooksPath hooks 2>/dev/null || true
865
+ fi
866
+ return 0
867
+ }
868
+
750
869
  _run_setup_step() {
751
870
  local watch="$1"; shift
752
871
  local before after
@@ -803,6 +922,10 @@ cmd_setup() {
803
922
  _run_setup_step "$ROLL_HOME/.peer-state" _peer_ensure_state_dir
804
923
  _record "$(_state_to_marker "$_ROLL_SETUP_STATE")" "Initialize peer-review state directory"
805
924
 
925
+ # US-INFRA-008: configure git hooks path so TCR pre-commit gate works in this repo
926
+ _run_setup_step "$PWD" _ensure_hooks_path
927
+ _record "$(_state_to_marker "$_ROLL_SETUP_STATE")" "Configure git hooks path"
928
+
806
929
  if command -v tmux >/dev/null 2>&1; then
807
930
  _record skip "Ensure tmux is installed (already present)"
808
931
  else
@@ -907,10 +1030,46 @@ HINT
907
1030
  # prints install commands for the ones that aren't, so users who already opted
908
1031
  # in (or opted out) don't get spammed each upgrade.
909
1032
  cmd_doctor() {
1033
+ _doctor_agent_section
910
1034
  _doctor_pr_section
911
1035
  _doctor_launchd_stale_section
912
1036
  }
913
1037
 
1038
+ # FIX-128: list every ai_* entry from config, tag each with binary-on-PATH
1039
+ # status and config-dir existence so the user can see at a glance which
1040
+ # agents are actually usable vs only have Roll-maintained dirs.
1041
+ _doctor_agent_section() {
1042
+ [[ -f "$ROLL_CONFIG" ]] || return 0
1043
+ echo ""
1044
+ echo "$(ROLL_LANG_RESOLVED=en msg doctor.agent_detection)"
1045
+ echo "$(ROLL_LANG_RESOLVED=zh msg doctor.agent_detection)"
1046
+ echo ""
1047
+ local _key _value _name _dir _installed _dir_exists _is_primary
1048
+ _is_primary=$(grep -E '^primary_agent:' "$ROLL_CONFIG" 2>/dev/null | sed 's/^primary_agent: *//')
1049
+ while IFS=: read -r _key _value; do
1050
+ [[ "$_key" =~ ^ai_ ]] || continue
1051
+ _name="${_key#ai_}"
1052
+ [[ "$_name" == "kimi_code" ]] && continue # dedupe
1053
+ _dir="${_value%%|*}"
1054
+ _dir="${_dir# }"
1055
+ _dir="${_dir/#\~/$HOME}"
1056
+ if _agent_installed_by_name "$_name" "$_dir"; then
1057
+ _installed="$(msg doctor.agent_installed)"
1058
+ else
1059
+ _installed="$(msg doctor.agent_missing)"
1060
+ fi
1061
+ if [[ -d "$_dir" ]]; then
1062
+ _dir_exists="$(msg doctor.agent_dir_exists)"
1063
+ else
1064
+ _dir_exists="$(msg doctor.agent_dir_missing)"
1065
+ fi
1066
+ local _tag=""
1067
+ [[ "$_name" == "$_is_primary" ]] && _tag=" ($(msg doctor.agent_primary_label))"
1068
+ printf " %-10s %-14s %s%s\n" "$_name" "$_installed" "$_dir_exists" "$_tag"
1069
+ done < "$ROLL_CONFIG"
1070
+ return 0
1071
+ }
1072
+
914
1073
  # FIX-097: scan ${_LAUNCHD_DIR}/com.roll.*.plist for entries whose
915
1074
  # WorkingDirectory no longer exists on disk. These are the ghost agents left
916
1075
  # behind when a user manually reproduces a bug under /private/tmp/ or
@@ -1447,15 +1606,31 @@ _onboard_discover_agents() {
1447
1606
  while IFS=: read -r _key _value; do
1448
1607
  [[ "$_key" =~ ^ai_ ]] || continue
1449
1608
  _name="${_key#ai_}"
1609
+ # ai_kimi_code → kimi (avoid listing the same agent twice).
1610
+ [[ "$_name" == "kimi_code" ]] && _name="kimi"
1450
1611
  _dir="${_value%%|*}"
1451
1612
  _dir="${_dir# }"
1452
1613
  _dir="${_dir/#\~/$HOME}"
1453
- if [[ -d "$_dir" ]]; then
1454
- _ONBOARD_INSTALLED+=("$_name")
1614
+ # FIX-128: route via _agent_installed_by_name so "installed" means the
1615
+ # CLI is actually on PATH for known agents, not just the config dir
1616
+ # that Roll's own convention sync would have created.
1617
+ if _agent_installed_by_name "$_name" "$_dir"; then
1618
+ # Dedupe — kimi may appear under both ai_kimi and ai_kimi_code.
1619
+ # `${arr[@]+...}` keeps `set -u` happy when the array is still empty.
1620
+ local _already=0 _existing
1621
+ for _existing in ${_ONBOARD_INSTALLED[@]+"${_ONBOARD_INSTALLED[@]}"}; do
1622
+ if [[ "$_existing" == "$_name" ]]; then _already=1; break; fi
1623
+ done
1624
+ if [[ $_already -eq 0 ]]; then _ONBOARD_INSTALLED+=("$_name"); fi
1455
1625
  else
1456
- _ONBOARD_MISSING+=("$_name")
1626
+ local _already=0 _existing
1627
+ for _existing in ${_ONBOARD_MISSING[@]+"${_ONBOARD_MISSING[@]}"}; do
1628
+ if [[ "$_existing" == "$_name" ]]; then _already=1; break; fi
1629
+ done
1630
+ if [[ $_already -eq 0 ]]; then _ONBOARD_MISSING+=("$_name"); fi
1457
1631
  fi
1458
1632
  done < "$ROLL_CONFIG"
1633
+ return 0
1459
1634
  }
1460
1635
 
1461
1636
  # US-ONBOARD-018: pick an agent for the onboard flow.
@@ -2719,7 +2894,7 @@ _peer_dispatch_in_tmux() {
2719
2894
  {
2720
2895
  printf '#!/bin/bash -l\n'
2721
2896
  # FIX-050: portable PATH assembly (was hardcoded /opt/homebrew/bin)
2722
- printf 'for _d in /opt/homebrew/bin /usr/local/bin /opt/local/bin "$HOME/.local/bin"; do\n'
2897
+ printf 'for _d in /opt/homebrew/bin /usr/local/bin /opt/local/bin "$HOME/.local/bin" "$HOME/.kimi-code/bin"; do\n'
2723
2898
  printf ' case ":$PATH:" in *":$_d:"*) ;; *) [ -d "$_d" ] && PATH="$_d:$PATH" ;; esac\n'
2724
2899
  printf 'done; export PATH\n'
2725
2900
  printf '%s > %q 2> %q || true\n' "$cmd_str" "$out_file" "$err_file"
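The generated script above prepends candidate directories to PATH only when they exist and are not already listed. A standalone sketch of that idempotent pattern follows; `prepend_path_dirs` is a hypothetical name used for illustration, and the directory list in the usage line simply mirrors the one in the hunk.

```shell
# Hypothetical helper: prepend each existing directory to PATH exactly once.
prepend_path_dirs() {
  local d
  for d in "$@"; do
    case ":$PATH:" in
      *":$d:"*) ;;                              # already present, skip
      *) [ -d "$d" ] && PATH="$d:$PATH" ;;      # add only if it exists
    esac
  done
  export PATH
}

# Usage: safe to call repeatedly; missing directories are ignored.
prepend_path_dirs /opt/homebrew/bin /usr/local/bin "$HOME/.local/bin" "$HOME/.kimi-code/bin"
```

Wrapping PATH in colons before matching avoids false hits on partial directory names, which is why the `case` pattern uses `":$d:"`.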
@@ -4604,13 +4779,16 @@ _isolation_tart_check_binary() {
4604
4779
  # returns 1 silently otherwise. Caller decides what to do.
4605
4780
  _isolation_tart_vm_present() {
4606
4781
  local name; name=$(_isolation_tart_vm_name)
4607
- tart list 2>/dev/null | awk -v n="$name" '$1 == n { found=1 } END { exit !found }'
4782
+ tart list 2>/dev/null | awk -v n="$name" '$2 == n { found=1 } END { exit !found }'
4608
4783
  }
4609
4784
 
4610
4785
  # Returns the VM's IP on stdout when reachable; exit non-zero when the VM
4611
4786
  # is stopped or `tart ip` fails for any other reason.
4612
4787
  _isolation_tart_ip() {
4613
4788
  local name; name=$(_isolation_tart_vm_name)
4789
+ # FIX: tart ip returns a stale DHCP-cached IP even for stopped VMs.
4790
+ # Gate on tart list State field before trusting the IP.
4791
+ tart list 2>/dev/null | awk -v n="$name" '$2 == n && $NF == "running" { found=1 } END { exit !found }' || return 1
4614
4792
  local ip; ip=$(tart ip "$name" 2>/dev/null) || return 1
4615
4793
  [[ "$ip" =~ ^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$ ]] || return 1
4616
4794
  printf '%s\n' "$ip"
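The change above gates `tart ip` on the VM's state in the `tart list` table. The sketch below isolates that awk filter so it can be tried on canned input. It assumes, as the hunk's awk expressions imply, that the VM name is the second column and the state is the last column; the sample rows and VM names are made up, and `vm_is_running` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if the named VM appears in a
# `tart list`-style table on stdin with "running" in the last column.
vm_is_running() {
  awk -v n="$1" '$2 == n && $NF == "running" { found = 1 } END { exit !found }'
}

# Canned table for illustration only (layout assumed, values invented).
sample='Source Name Disk Size State
local demo-vm 50 22 running
local other-vm 50 22 stopped'

if printf '%s\n' "$sample" | vm_is_running demo-vm; then
  echo "demo-vm is running"
fi
```

Matching on an exact column value rather than a substring keeps a VM named `demo-vm-2` from satisfying a check for `demo-vm`.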
@@ -4660,7 +4838,7 @@ _isolation_tart_provision() {
4660
4838
  local ip; ip=$(_isolation_tart_ip) || { err "tart provision: VM not running"; return 1; }
4661
4839
  local user; user=$(_isolation_tart_ssh_user)
4662
4840
  ssh -o BatchMode=yes -o StrictHostKeyChecking=no \
4663
- "${user}@${ip}" "brew list bats >/dev/null 2>&1 || brew install bats-core; \
4841
+ "${user}@${ip}" "export PATH=/opt/homebrew/bin:/usr/local/bin:\$PATH; brew list bats >/dev/null 2>&1 || brew install bats-core; \
4664
4842
  brew list node >/dev/null 2>&1 || brew install node; \
4665
4843
  brew list bash >/dev/null 2>&1 || brew install bash"
4666
4844
  }
@@ -4675,7 +4853,7 @@ _isolation_tart_exec() {
4675
4853
  if ! ip=$(_isolation_tart_ip); then
4676
4854
  # VM stopped — start it in the background with the repo mounted.
4677
4855
  local repo_root; repo_root="$(pwd -P)"
4678
- tart run --dir="roll:${repo_root}" "$name" >/dev/null 2>&1 &
4856
+ tart run --no-graphics --dir="roll:${repo_root}" "$name" >/dev/null 2>&1 &
4679
4857
  # Wait up to ~30s for IP to come up.
4680
4858
  local i=0
4681
4859
  while (( i < 30 )); do
@@ -4686,7 +4864,9 @@ _isolation_tart_exec() {
4686
4864
  [[ -n "${ip:-}" ]] || { err "tart exec: VM failed to start in 30s"; return 1; }
4687
4865
  fi
4688
4866
  local user; user=$(_isolation_tart_ssh_user)
4689
- ssh -o BatchMode=yes -o StrictHostKeyChecking=no "${user}@${ip}" "$@"
4867
+ local remote_cmd
4868
+ remote_cmd=$(printf '%q ' "$@")
4869
+ ssh -o BatchMode=yes -o StrictHostKeyChecking=no "${user}@${ip}" "export PATH=/opt/homebrew/bin:/usr/local/bin:\$PATH; cd '/Volumes/My Shared Files/roll' && $remote_cmd"
4690
4870
  }
4691
4871
 
4692
4872
  # reset: stop, delete, re-clone from base image, then re-provision.
@@ -4790,7 +4970,8 @@ Flags:
4790
4970
  --help, -h Show this help.
4791
4971
 
4792
4972
  Examples:
4793
- roll test Run the suite in whatever the config says.
4973
+ roll test Run affected tests (default: --affected HEAD~1).
4974
+ roll test -- tests/ Run the full suite explicitly.
4794
4975
  roll test -- --tier=fast Forward arguments to npm test.
4795
4976
  roll test --where Don't run; just report routing.
4796
4977
  roll test --reset Rebuild the VM (or host no-op).
@@ -4835,7 +5016,16 @@ EOF
4835
5016
  fi
4836
5017
 
4837
5018
  # Pass remaining args through to npm test inside the configured adapter.
4838
- _isolation_dispatch exec npm test "$@"
5019
+ # Default to --affected (HEAD~1 base) when the caller passes no extra args —
5020
+ # mirrors the pre-commit hook's intent and keeps VM runs fast.
5021
+ # To run the full suite explicitly: roll test -- tests/
5022
+ local _npm_args=("$@")
5023
+ if [[ "${#_npm_args[@]}" -eq 0 ]]; then
5024
+ _npm_args=(--affected)
5025
+ fi
5026
+ # Always pass args via `--` so npm doesn't intercept flags like --affected
5027
+ # as npm config options (npm warns and silently drops them otherwise).
5028
+ _isolation_dispatch exec npm test -- "${_npm_args[@]}"
4839
5029
  }
4840
5030
 
4841
5031
  # ═══════════════════════════════════════════════════════════════════════════════
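The hunk above makes `roll test` default to `--affected` when no extra arguments are given, and always places a `--` separator before the forwarded arguments so npm does not treat them as its own options. A small sketch of that argument handling follows; `build_test_command` is a hypothetical helper that only prints the command line it would run, so it can be tried without npm installed.

```shell
# Hypothetical helper: print the npm command line that would be dispatched.
# With no arguments it falls back to --affected, mirroring the hunk above.
build_test_command() {
  if [ "$#" -eq 0 ]; then
    set -- --affected
  fi
  printf 'npm test --'
  printf ' %s' "$@"
  printf '\n'
}

build_test_command
build_test_command --tier=fast
```

Printing rather than executing is only for demonstration; the shipped code passes the arguments to `_isolation_dispatch exec`.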
@@ -5411,6 +5601,28 @@ _loop_schedule_spec() {
5411
5601
  echo "60 $offset"
5412
5602
  }
5413
5603
 
5604
+ # Read loop active window from .roll/local.yaml loop_schedule block.
5605
+ # Resolution order:
5606
+ # 1. .roll/local.yaml loop_schedule.{active_start,active_end}
5607
+ # 2. default 0 / 24 (full day)
5608
+ # Validation: both values must be integers 0–24, active_start < active_end.
5609
+ # Output: "<start> <end>" on stdout.
5610
+ _loop_read_active_window() {
5611
+ local project_path="${1:-$(pwd -P)}"
5612
+ local local_file="${project_path}/.roll/local.yaml"
5613
+ if [[ -f "$local_file" ]]; then
5614
+ local val_start val_end
5615
+ val_start=$(awk '/^loop_schedule:/{found=1;next} found && /^[[:space:]]+active_start:/{print $2; exit}' "$local_file")
5616
+ val_end=$(awk '/^loop_schedule:/{found=1;next} found && /^[[:space:]]+active_end:/{print $2; exit}' "$local_file")
5617
+ if [[ "$val_start" =~ ^[0-9]+$ && "$val_end" =~ ^[0-9]+$ ]] \
5618
+ && (( val_start < val_end && val_end <= 24 )); then
5619
+ echo "$val_start $val_end"
5620
+ return 0
5621
+ fi
5622
+ fi
5623
+ echo "0 24"
5624
+ }
5625
+
5414
5626
  # US-LOOP-032: human-readable schedule description.
5415
5627
  # Args: period offset [lang]
5416
5628
  # lang: en (default) or zh
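`_loop_read_active_window` above reads `active_start` and `active_end` from a `loop_schedule` block in `.roll/local.yaml` and falls back to `0 24`. The sketch below is a simplified standalone variant for experimenting with that resolution order, not the shipped function: `read_active_window` and `is_uint` are hypothetical names, the awk patterns use `[ \t]` instead of `[[:space:]]`, and the comparison uses `[ ... -lt ... ]`, which treats values such as `09` as decimal.

```shell
# Hypothetical helpers for illustration; not part of bin/roll.
is_uint() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
  esac
}

# Print "<start> <end>" from the loop_schedule block, or "0 24" when the
# file is missing or the values fail validation.
read_active_window() {
  local file="$1" start="" end=""
  if [ -f "$file" ]; then
    start=$(awk '/^loop_schedule:/{f=1;next} f && /^[ \t]+active_start:/{print $2; exit}' "$file")
    end=$(awk '/^loop_schedule:/{f=1;next} f && /^[ \t]+active_end:/{print $2; exit}' "$file")
    if is_uint "$start" && is_uint "$end" \
       && [ "$start" -lt "$end" ] && [ "$end" -le 24 ]; then
      echo "$start $end"
      return 0
    fi
  fi
  echo "0 24"
}

# Demo with a throwaway file shaped like the block the comments describe.
demo_dir=$(mktemp -d)
printf 'loop_schedule:\n  active_start: 9\n  active_end: 18\n' > "$demo_dir/local.yaml"
read_active_window "$demo_dir/local.yaml"
read_active_window "$demo_dir/missing.yaml"
rm -rf "$demo_dir"
```

An inverted or out-of-range window is treated the same as a missing file, so a typo in the project config falls back to the full-day default instead of disabling the loop.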
@@ -5563,6 +5775,8 @@ _detect_path_prepend() {
5563
5775
  [[ -d /usr/local/bin ]] && dirs+=("/usr/local/bin")
5564
5776
  [[ -d /opt/local/bin ]] && dirs+=("/opt/local/bin")
5565
5777
  [[ -d "$HOME/.local/bin" ]] && dirs+=("$HOME/.local/bin")
5778
+ # FIX-129: kimi-code installs to ~/.kimi-code/bin (not brew/local), launchd misses it
5779
+ [[ -d "$HOME/.kimi-code/bin" ]] && dirs+=("$HOME/.kimi-code/bin")
5566
5780
  dirs+=("/usr/bin" "/bin" "/usr/sbin" "/sbin")
5567
5781
  for d in "${dirs[@]}"; do
5568
5782
  case ":$seen:" in *":$d:"*) continue ;; esac
@@ -5728,7 +5942,7 @@ set -o pipefail
5728
5942
  # FIX-050: portable PATH assembly — launchd/cron deliver a bare PATH that
5729
5943
  # misses brew-installed tools (tmux, claude, node, …). Iterate candidate
5730
5944
  # dirs; only prepend when present and not already in PATH. Idempotent.
5731
- for _d in /opt/homebrew/bin /usr/local/bin /opt/local/bin "\$HOME/.local/bin"; do
5945
+ for _d in /opt/homebrew/bin /usr/local/bin /opt/local/bin "\$HOME/.local/bin" "\$HOME/.kimi-code/bin"; do
5732
5946
  case ":\$PATH:" in *":\$_d:"*) ;; *) [ -d "\$_d" ] && PATH="\$_d:\$PATH" ;; esac
5733
5947
  done
5734
5948
  export PATH
@@ -6061,6 +6275,10 @@ _phase_begin startup
6061
6275
  _phase_end startup ok
6062
6276
  _phase_begin preflight
6063
6277
  cd "${project_path}" 2>/dev/null || true
6278
+ # US-INFRA-008: ensure git hooks are wired so TCR pre-commit gate can't be bypassed
6279
+ _ensure_hooks_path "${project_path}" 2>/dev/null || true
6280
+ # US-LOOP-056: sync .roll/ meta from roll-meta remote before backlog scan
6281
+ _loop_sync_meta "${project_path}" || true
6064
6282
  # FIX-104: GC stale merged temp branches at cycle entry — before worktree setup
6065
6283
  # and before any early-exit gate (pre-run abort, CI red precheck). The post-claude
6066
6284
  # call site doesn't cover those paths, so merged branches accumulated on origin.
@@ -6372,7 +6590,7 @@ INNER
6372
6590
  # FIX-050: portable PATH assembly before any brew-tool lookup (tmux, caffeinate
6373
6591
  # on some systems, claude). Mirrors the inner script's bootstrap so even when
6374
6592
  # launchd's plist EnvironmentVariables is stale, the runner self-repairs.
6375
- for _d in /opt/homebrew/bin /usr/local/bin /opt/local/bin "\$HOME/.local/bin"; do
6593
+ for _d in /opt/homebrew/bin /usr/local/bin /opt/local/bin "\$HOME/.local/bin" "\$HOME/.kimi-code/bin"; do
6376
6594
  case ":\$PATH:" in *":\$_d:"*) ;; *) [ -d "\$_d" ] && PATH="\$_d:\$PATH" ;; esac
6377
6595
  done
6378
6596
  export PATH
@@ -6459,10 +6677,24 @@ if command -v tmux >/dev/null 2>&1; then
6459
6677
  tmux list-sessions -F "#{session_name}" 2>/dev/null | grep "^roll-loop-${slug}\$" | while read _s; do
6460
6678
  tmux kill-session -t "\$_s" 2>/dev/null || true
6461
6679
  done
6462
- tmux new-session -d -s "\$SESSION" -x 200 -y 50 "bash \"\$INNER_SCRIPT\""
6463
- CYCLE_LOG_RAW="${project_path}/.roll/cycle-logs/.pipe-\$\$.raw"
6680
+ # FIX-132: syntax-check the inner script before spawning the tmux session.
6681
+ # A heredoc quoting regression or mid-cycle regeneration can silently produce
6682
+ # a syntactically broken script; catching it here prevents the session from
6683
+ # starting in a corrupted state and logging a misleading "exited 0, retrying".
6684
+ if ! bash -n "\$INNER_SCRIPT" 2>>"\$LOG"; then
6685
+ echo "[\$(date '+%Y-%m-%dT%H:%M:%S%z')] ABORT: inner script failed syntax check — cycle skipped (see log: \$LOG)" >> "\$LOG"
6686
+ exit 1
6687
+ fi
6688
+ # FIX-130: export ROLL_CYCLE_LOG_RAW BEFORE spawning the tmux session so
6689
+ # the inner script inherits it (env vars are inherited at spawn time, not
6690
+ # retroactively — exporting after new-session means inner never sees it and
6691
+ # _inner_cleanup skips log archiving, leaving only orphan .pipe-*.raw files).
6464
6692
  mkdir -p "${project_path}/.roll/cycle-logs"
6693
+ # Clean orphan .pipe-*.raw files from previous crashed cycles
6694
+ find "${project_path}/.roll/cycle-logs" -name '.pipe-*.raw' -delete 2>/dev/null || true
6695
+ CYCLE_LOG_RAW="${project_path}/.roll/cycle-logs/.pipe-\$\$.raw"
6465
6696
  export ROLL_CYCLE_LOG_RAW="\$CYCLE_LOG_RAW"
6697
+ tmux new-session -d -s "\$SESSION" -x 200 -y 50 "bash \"\$INNER_SCRIPT\""
6466
6698
  tmux pipe-pane -t "\$SESSION" "tee -a \"\$LOG\" >> \"\$ROLL_CYCLE_LOG_RAW\""
6467
6699
  # Auto-attach popup: when not muted, spawn a Terminal.app window attached
6468
6700
  # to the tmux session so the user can watch the loop work in real time.
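FIX-130 above reorders the runner so that `ROLL_CYCLE_LOG_RAW` is exported before the tmux session is spawned, because a child process only receives the environment that exists at spawn time. The sketch below demonstrates that behavior with a plain child shell instead of tmux; `DEMO_CYCLE_LOG_RAW` and `report_env` are hypothetical names used only for this illustration.

```shell
# Hypothetical demo: a child shell reports whether it inherited the variable.
report_env() {
  sh -c 'printf "%s\n" "${DEMO_CYCLE_LOG_RAW:-unset}"'
}

unset DEMO_CYCLE_LOG_RAW
DEMO_CYCLE_LOG_RAW=/tmp/demo.raw   # assigned in this shell, not yet exported
report_env                         # child spawned before export: prints "unset"

export DEMO_CYCLE_LOG_RAW
report_env                         # child spawned after export: prints "/tmp/demo.raw"
```

The same ordering applies to any long-lived child such as a tmux session: exporting later changes the parent's environment only, and the already-running child never sees it.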
@@ -6481,7 +6713,9 @@ if command -v tmux >/dev/null 2>&1; then
6481
6713
  # window closes the instant the tmux session ends (cycle_end kills
6482
6714
  # the session) and the entire scrollback disappears with it; the
6483
6715
  # cron-<slug>.log file still has the full transcript as a fallback.
6484
- printf '#!/bin/bash\\ntmux attach -t %s\\necho\\necho "================================================================"\\necho " cycle ended. log: ~/.shared/roll/loop/cron-%s.log"\\necho " press enter to close this window."\\necho "================================================================"\\nread _\\n' \\
6716
+ # FIX-131: after tmux session ends, open the cron log with less so the
6717
+ # user can scroll through the full cycle output instead of seeing nothing.
6718
+ printf '#!/bin/bash\\ntmux attach -t %s 2>/dev/null\\nLOGFILE=~/.shared/roll/loop/cron-%s.log\\necho\\nif [ -f "\$LOGFILE" ]; then\\n echo "================================================================"\\n echo " Cycle ended — showing log (arrows to scroll, q to close)"\\n echo "================================================================"\\n less -R +G "\$LOGFILE"\\nelse\\n echo "================================================================"\\n echo " Cycle ended. Log not found: \$LOGFILE"\\n echo " press enter to close."\\n echo "================================================================"\\n read _\\nfi\\n' \\
6485
6719
  "\$SESSION" "${slug}" > "\$_attach_cmd" 2>/dev/null || true
6486
6720
  chmod +x "\$_attach_cmd" 2>/dev/null || true
6487
6721
  open -g -a Terminal "\$_attach_cmd" >/dev/null 2>&1 || true
@@ -6583,8 +6817,8 @@ _install_launchd_plists() {
6583
6817
  mkdir -p "${shared}/loop" "${shared}/dream" "${shared}/brief"
6584
6818
 
6585
6819
  local active_start active_end dream_hour dream_minute brief_hour brief_minute loop_period loop_offset
6586
- active_start=$(_config_read_int "loop_active_start" "10")
6587
- active_end=$(_config_read_int "loop_active_end" "18")
6820
+ local _aw; _aw=$(_loop_read_active_window "$project_path")
6821
+ active_start="${_aw%% *}"; active_end="${_aw##* }"
6588
6822
  # US-LOOP-012: use _loop_schedule_spec instead of raw loop_minute
6589
6823
  local loop_spec; loop_spec=$(_loop_schedule_spec "$project_path")
6590
6824
  loop_period="${loop_spec%% *}"
@@ -6710,10 +6944,11 @@ cmd_loop() {
6710
6944
  resume) _loop_resume ;;
6711
6945
  reset) _loop_reset ;;
6712
6946
  gc) shift; _loop_gc "$@" ;;
6713
- notify) _notify "${1:-roll}" "${2:-}" ;;
6714
- enforce-tcr) _loop_enforce_tcr "${1:-}" "${2:-}" ;;
6715
- precheck-ci) _loop_precheck_ci ;;
6716
- branches) _loop_branches "$(pwd -P)" ;;
6947
+ notify) _notify "${1:-roll}" "${2:-}" ;;
6948
+ enforce-tcr) _loop_enforce_tcr "${1:-}" "${2:-}" ;;
6949
+ precheck-ci) _loop_precheck_ci ;;
6950
+ hotfix-head-context) _loop_hotfix_head_context "${1:-}" ;;
6951
+ branches) _loop_branches "$(pwd -P)" ;;
6717
6952
  *) cat <<'HELP'
6718
6953
  Usage: roll loop <on|off|now|test|status|monitor|runs|log|story|events|attach|mute|unmute|pause|resume|reset|gc|branches>
6719
6954
 
@@ -6759,8 +6994,8 @@ _loop_on() {
6759
6994
  local agent; agent=$(_project_agent)
6760
6995
 
6761
6996
  local active_start active_end loop_minute dream_hour dream_minute brief_hour brief_minute
6762
- active_start=$(_config_read_int "loop_active_start" "10")
6763
- active_end=$(_config_read_int "loop_active_end" "18")
6997
+ local _aw; _aw=$(_loop_read_active_window "$project_path")
6998
+ active_start="${_aw%% *}"; active_end="${_aw##* }"
6764
6999
  # US-LOOP-011: read schedule spec from project or global config
6765
7000
  local loop_spec loop_period loop_offset
6766
7001
  loop_spec=$(_loop_schedule_spec "$project_path")
@@ -6806,10 +7041,10 @@ _loop_on() {
6806
7041
  fi
6807
7042
 
6808
7043
  ok "$(msg loop.loop_enabled)"
6809
- printf "$(msg loop.roll_loop_s_active_02d_00)" \
7044
+ msg loop.roll_loop_s_active_02d_00 \
6810
7045
  "$loop_sched_en" "$active_start" "$active_end" "$loop_sched_zh" "$active_start" "$active_end"
6811
- printf "$(msg loop.roll_dream_daily_at_02d_02d)" "$dream_hour" "$dream_minute" "$dream_hour" "$dream_minute"
6812
- printf "$(msg loop.roll_brief_daily_at_02d_02d)" "$brief_hour" "$brief_minute" "$brief_hour" "$brief_minute"
7046
+ msg loop.roll_dream_daily_at_02d_02d "$dream_hour" "$dream_minute" "$dream_hour" "$dream_minute"
7047
+ msg loop.roll_brief_daily_at_02d_02d "$brief_hour" "$brief_minute" "$brief_hour" "$brief_minute"
6813
7048
  echo " • Agent: ${agent} (change: roll agent use <name>)"
6814
7049
  return 0
6815
7050
  fi
@@ -6837,10 +7072,10 @@ _loop_on() {
6837
7072
  ) | crontab -
6838
7073
 
6839
7074
  ok "$(msg loop.loop_enabled_2)"
6840
- printf "$(msg loop.roll_loop_s_active_02d_00_2)" \
7075
+ msg loop.roll_loop_s_active_02d_00_2 \
6841
7076
  "$loop_sched_en" "$active_start" "$active_end" "$loop_sched_zh" "$active_start" "$active_end"
6842
- printf "$(msg loop.roll_dream_daily_at_02d_02d_2)" "$dream_hour" "$dream_minute" "$dream_hour" "$dream_minute"
6843
- printf "$(msg loop.roll_brief_daily_at_02d_02d_2)" "$brief_hour" "$brief_minute" "$brief_hour" "$brief_minute"
7077
+ msg loop.roll_dream_daily_at_02d_02d_2 "$dream_hour" "$dream_minute" "$dream_hour" "$dream_minute"
7078
+ msg loop.roll_brief_daily_at_02d_02d_2 "$brief_hour" "$brief_minute" "$brief_hour" "$brief_minute"
6844
7079
  echo " • Agent: ${agent} (change: roll agent use <name>)"
6845
7080
  }
6846
7081
 
@@ -6980,8 +7215,8 @@ _loop_test() {
6980
7215
 
6981
7216
  # FIX-054: terminal preference removed — runner always uses Terminal.app.
6982
7217
  local active_start active_end
6983
- active_start=$(_config_read_int "loop_active_start" "10")
6984
- active_end=$(_config_read_int "loop_active_end" "18")
7218
+ local _aw; _aw=$(_loop_read_active_window "$project_path")
7219
+ active_start="${_aw%% *}"; active_end="${_aw##* }"
6985
7220
 
6986
7221
  info "$(msg loop.generating_test_runner_agent ${agent})"
6987
7222
  _write_loop_runner_script "$test_runner" "$project_path" \
@@ -7741,6 +7976,66 @@ _ci_wait() {
7741
7976
  }
7742
7977
 
7743
7978
  # Pre-run CI health check — call before picking up new stories.
7979
+ # US-LOOP-056: sync .roll/ (roll-meta private submodule) before each cycle so
7980
+ # the cycle always runs against the latest backlog. Fail-soft: any error emits
7981
+ # a meta_sync event and returns 0 so the cycle continues with stale/existing meta.
7982
+ #
7983
+ # Statuses emitted via _loop_event meta_sync:
7984
+ # ok – fetch + reset --hard succeeded
7985
+ # stale – fetch failed; existing .roll/ used as fallback
7986
+ # skipped – no git remote configured (not a roll-meta managed project)
7987
+ #
7988
+ # Env override: ROLL_LOOP_META_SYNC_TIMEOUT (default 15) controls fetch timeout.
7989
+ _loop_sync_meta() {
7990
+ local project_path="$1"
7991
+ local roll_meta="${project_path}/.roll"
7992
+ local timeout_sec="${ROLL_LOOP_META_SYNC_TIMEOUT:-15}"
7993
+ local cid="${CYCLE_ID:-unknown}"
7994
+ local slug="${_LOOP_PROJ_SLUG:-$(_project_slug 2>/dev/null || echo unknown)}"
7995
+ local shared_dir="${_SHARED_ROOT:-$HOME/.shared/roll}/loop"
7996
+ local fail_counter="${shared_dir}/meta-sync-fail-${slug}"
7997
+
7998
+ # Detect remote via the canonical probe point. If .roll/ has no .git or no
7999
+ # remote configured, treat as "not managed" and skip silently.
8000
+ local remote_url
8001
+ remote_url=$(git -C "$roll_meta" remote get-url origin 2>/dev/null || echo "")
8002
+ if [ -z "$remote_url" ]; then
8003
+ return 0
8004
+ fi
8005
+
8006
+ # Attempt fetch with timeout
8007
+ local _fetch_ok=0
8008
+ if command -v timeout >/dev/null 2>&1; then
8009
+ timeout "$timeout_sec" git -C "$roll_meta" fetch --quiet 2>/dev/null && _fetch_ok=1
8010
+ else
8011
+ git -C "$roll_meta" fetch --quiet 2>/dev/null && _fetch_ok=1
8012
+ fi
8013
+
8014
+ if [ "$_fetch_ok" -eq 1 ]; then
8015
+ if git -C "$roll_meta" reset --hard origin/main --quiet 2>/dev/null; then
8016
+ _loop_event meta_sync "$cid" "ok" "" 2>/dev/null || true
8017
+ # US-LOOP-057: reset consecutive failure counter on success
8018
+ rm -f "$fail_counter" 2>/dev/null || true
8019
+ return 0
8020
+ fi
8021
+ fi
8022
+
8023
+ # Fetch or reset failed — stale .roll/ used; cycle continues
8024
+ _loop_event meta_sync "$cid" "stale" "fetch/reset failed" 2>/dev/null || true
8025
+
8026
+ # US-LOOP-057: increment failure counter; write ALERT after 3 consecutive failures
8027
+ mkdir -p "$shared_dir" 2>/dev/null || true
8028
+ local count=0
8029
+ [ -f "$fail_counter" ] && count=$(cat "$fail_counter" 2>/dev/null || echo 0)
8030
+ count=$(( count + 1 ))
8031
+ printf '%s\n' "$count" > "$fail_counter"
8032
+ if [ "$count" -ge 3 ]; then
8033
+ printf '[%s] roll-meta sync consecutive failures: %d times. Check SSH key / network.\n Last error: fetch/reset failed for %s\n' \
8034
+ "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$count" "$remote_url" >> "${shared_dir}/ALERT-${slug}.md" 2>/dev/null || true
8035
+ fi
8036
+ return 0
8037
+ }
8038
+
7744
8039
  # Refuses to build on a red base (HEAD CI failed). Lenient on unknown states
7745
8040
  # (gh missing, repo unparseable, no runs yet) — the post-build _loop_enforce_ci
7746
8041
  # is the strict gate.
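Note on `_loop_sync_meta` in the hunk above: its failure bookkeeping (reset on success, increment on failure, alert from the third consecutive failure) can be isolated into a small sketch. The function name `record_sync_result`, its arguments, and the alert text are hypothetical; only the counter logic follows the source.

```bash
#!/usr/bin/env bash
# Hypothetical helper mirroring the counter logic in _loop_sync_meta:
# reset on success, increment on failure, append an alert line once the
# count reaches 3.
record_sync_result() {
  local result="$1" counter_file="$2" alert_file="$3"
  local count=0

  if [ "$result" = "ok" ]; then
    rm -f "$counter_file"
    return 0
  fi

  if [ -f "$counter_file" ]; then
    count=$(cat "$counter_file" 2>/dev/null || echo 0)
  fi
  count=$(( count + 1 ))
  printf '%s\n' "$count" > "$counter_file"

  if [ "$count" -ge 3 ]; then
    printf 'sync failed %d times in a row\n' "$count" >> "$alert_file"
  fi
  return 0
}

# Demo in a throwaway directory
demo_dir=$(mktemp -d)
for _ in 1 2 3; do
  record_sync_result fail "$demo_dir/count" "$demo_dir/ALERT.md"
done
cat "$demo_dir/ALERT.md"
rm -rf "$demo_dir"
```

Because the alert is appended with `>>` on every failure at or past the threshold, a long outage adds one line per cycle. That matches the append in the source.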
@@ -7773,6 +8068,38 @@ _loop_precheck_ci() {
7773
8068
  run_states=$(echo "$runs" \
7774
8069
  | jq -r '[.[] | "\(.status // "?")/\(.conclusion // "null")"] | unique | join(", ")' \
7775
8070
  2>/dev/null || echo "?")
8071
+
8072
+ # US-LOOP-046/048: check whether hot-fix path is allowed before aborting.
8073
+ # ROLL_LOOP_NO_HEAL=1 or ROLL_LOOP_HEAL_MAX=0 → fall through to original abort.
8074
+ local _heal_max="${ROLL_LOOP_HEAL_MAX:-2}"
8075
+ if [[ "${ROLL_LOOP_NO_HEAL:-}" != "1" ]] && [[ "$_heal_max" -gt 0 ]]; then
8076
+ local _state_file="${_SHARED_ROOT:-$HOME/.shared/roll}/loop/state-${_LOOP_PROJ_SLUG:-$(basename "$PWD")}.yaml"
8077
+ local _heal_key="heal_count_head_${commit:0:8}"
8078
+ local _count=0
8079
+ [[ -f "$_state_file" ]] && _count=$(grep "^${_heal_key}:" "$_state_file" 2>/dev/null | awk '{print $2}' || echo 0)
8080
+ _count=$(( ${_count:-0} + 0 )) # coerce to int
8081
+ if [[ "$_count" -lt "$_heal_max" ]]; then
8082
+ # Increment counter and signal hot-fix path to the agent
8083
+ _count=$(( _count + 1 ))
8084
+ mkdir -p "$(dirname "$_state_file")" 2>/dev/null || true
8085
+ if [[ -f "$_state_file" ]]; then
8086
+ # Update existing key or append
8087
+ if grep -q "^${_heal_key}:" "$_state_file" 2>/dev/null; then
8088
+ local _tmp; _tmp=$(mktemp)
8089
+ grep -v "^${_heal_key}:" "$_state_file" > "$_tmp" 2>/dev/null || true
8090
+ printf '%s: %d\n' "$_heal_key" "$_count" >> "$_tmp"
8091
+ mv "$_tmp" "$_state_file"
8092
+ else
8093
+ printf '%s: %d\n' "$_heal_key" "$_count" >> "$_state_file"
8094
+ fi
8095
+ else
8096
+ printf '%s: %d\n' "$_heal_key" "$_count" > "$_state_file"
8097
+ fi
8098
+ # Exit 2 signals the agent: CI is red, hot-fix path is available
8099
+ return 2
8100
+ fi
8101
+ fi
8102
+
7776
8103
  err "$(msg loop.pre_run_ci_check_head_ci ${short})"
7777
8104
  mkdir -p "$(dirname "$_LOOP_ALERT")"
7778
8105
  cat > "$_LOOP_ALERT" << EOF
@@ -7795,6 +8122,62 @@ EOF
7795
8122
  return 0
7796
8123
  }
7797
8124
 
8125
+ # US-LOOP-047: hot-fix context factory for HEAD CI failures.
8126
+ # Captures failing run logs + recent commit diff, writes to /tmp/roll-heal-head-<sha>.log
8127
+ # Returns 0 and prints the log path on success; 1 if context could not be gathered.
8128
+ _loop_hotfix_head_context() {
8129
+ local commit="${1:-$(git rev-parse HEAD 2>/dev/null)}"
8130
+ [[ -z "$commit" ]] && return 1
8131
+ local short="${commit:0:8}"
8132
+ local outfile="/tmp/roll-heal-head-${short}.log"
8133
+ local slug; _gh_resolve slug || slug="unknown"
8134
+
8135
+ {
8136
+ printf '=== CI Hot-fix Context: HEAD %s ===\n\n' "$short"
8137
+ printf '--- Recent commits ---\n'
8138
+ git log --oneline -5 2>/dev/null || true
8139
+ printf '\n--- Diff of last commit ---\n'
8140
+ git show --stat HEAD 2>/dev/null | head -40 || true
8141
+ printf '\n--- CI failure logs (head 200 lines) ---\n'
8142
+ local run_id
8143
+ run_id=$(gh -R "$slug" run list --commit "$commit" \
8144
+ --json databaseId,conclusion -L 5 2>/dev/null \
8145
+ | jq -r '.[] | select(.conclusion=="failure") | .databaseId' 2>/dev/null | head -1)
8146
+ if [[ -n "$run_id" ]]; then
8147
+ gh -R "$slug" run view --log-failed "$run_id" 2>/dev/null | head -200 || true
8148
+ else
8149
+ printf '(no failed run found for commit %s)\n' "$short"
8150
+ fi
8151
+ } > "$outfile" 2>/dev/null
8152
+ printf '%s\n' "$outfile"
8153
+ return 0
8154
+ }
8155
+
8156
+ # US-LOOP-050: PR hot-fix entry point.
8157
+ # Checks out the PR branch, captures CI failure logs, and prepares context
8158
+ # for a roll-fix invocation on the PR branch.
8159
+ # Usage: _loop_hot_fix_pr <pr_number>
8160
+ _loop_hot_fix_pr() {
8161
+ local pr_num="$1"
8162
+ [[ -z "$pr_num" ]] && return 1
8163
+ local slug; _gh_resolve slug || return 1
8164
+ local outfile="/tmp/roll-heal-pr-${pr_num}.log"
8165
+ local run_id
8166
+ run_id=$(gh -R "$slug" run list --json databaseId,conclusion,headBranch -L 20 2>/dev/null \
8167
+ | jq -r --argjson pr "\"$pr_num\"" \
8168
+ '.[] | select(.conclusion=="failure") | .databaseId' 2>/dev/null | head -1)
8169
+ {
8170
+ printf '=== PR #%s CI Hot-fix Context ===\n\n' "$pr_num"
8171
+ if [[ -n "$run_id" ]]; then
8172
+ gh -R "$slug" run view --log-failed "$run_id" 2>/dev/null | head -200 || true
8173
+ else
8174
+ printf '(no failed run found for PR #%s)\n' "$pr_num"
8175
+ fi
8176
+ } > "$outfile" 2>/dev/null
8177
+ printf '%s\n' "$outfile"
8178
+ return 0
8179
+ }
8180
+
7798
8181
  # _loop_diagnose_open_prs <slug>
7799
8182
  # Appended to ALERT when CI is red on HEAD.
7800
8183
  # For each open PR targeting main: lists CI failing tests + changed files,
@@ -8029,7 +8412,14 @@ _loop_pr_classify() {
8029
8412
  local mergeable="${4:-}"
8030
8413
 
8031
8414
  case "$head_ref" in
8032
- loop/*) echo "loop_self"; return 0 ;;
8415
+ loop/*)
8416
+ # US-LOOP-049: loop/* PRs with CI failure get their own classification
8417
+ # so _loop_pr_inbox can route them to the PR hot-fix path.
8418
+ if [[ "$ci_state" == "failure" ]]; then
8419
+ echo "loop_self_ci_red"; return 0
8420
+ fi
8421
+ echo "loop_self"; return 0
8422
+ ;;
8033
8423
  esac
8034
8424
 
8035
8425
  case "$human_review" in
@@ -9109,8 +9499,8 @@ _loop_monitor() {
9109
9499
  echo -e "$(msg loop.services ${BOLD} ${NC} ${CYAN} ${agent})"
9110
9500
  if [[ "$(uname)" == "Darwin" ]]; then
9111
9501
  local active_start active_end dream_hour dream_minute brief_hour brief_minute
9112
- active_start=$(_config_read_int "loop_active_start" "10")
9113
- active_end=$(_config_read_int "loop_active_end" "18")
9502
+ local _aw; _aw=$(_loop_read_active_window "$project_path")
9503
+ active_start="${_aw%% *}"; active_end="${_aw##* }"
9114
9504
  # US-LOOP-013: use schedule spec for display
9115
9505
  local loop_spec loop_period loop_offset
9116
9506
  loop_spec=$(_loop_schedule_spec "$project_path")
@@ -9980,8 +10370,8 @@ _legacy_home() {
9980
10370
  crontab -l 2>/dev/null | grep -q "${_LOOP_TAG}:${project_path}" && loop_state="enabled"
9981
10371
  fi
9982
10372
  local active_start active_end dream_hour dream_minute brief_hour brief_minute
9983
- active_start=$(_config_read_int "loop_active_start" "10")
9984
- active_end=$(_config_read_int "loop_active_end" "18")
10373
+ local _aw; _aw=$(_loop_read_active_window "$project_path")
10374
+ active_start="${_aw%% *}"; active_end="${_aw##* }"
9985
10375
  # US-LOOP-013: use schedule spec for display
9986
10376
  local loop_spec loop_period loop_offset
9987
10377
  loop_spec=$(_loop_schedule_spec "$project_path")
package/lib/README.md ADDED
@@ -0,0 +1,42 @@
1
+ > **Draft** — auto-generated by roll-doc on 2026-05-28. Review before treating as authoritative.
2
+
3
+ # lib/ — Python helpers and i18n runtime
4
+
5
+ Python scripts and shell libraries that `bin/roll` delegates to for rendering-heavy or data-processing tasks.
6
+
7
+ ## Key files
8
+
9
+ | File | Purpose |
10
+ |------|---------|
11
+ | `roll-loop-status.py` | Renders the `roll loop status` health dashboard — reads cycle event NDJSON, computes per-cycle rows, daily rollups, and phase-tracing breakdown |
12
+ | `roll-loop-story.py` | Per-story rollup: aggregates cycles, tokens, cost, and PR outcomes for `roll loop story <ID>` |
13
+ | `roll-status.py` | Renders the `roll status` one-screen sync health view |
14
+ | `roll-init.py` | Init-flow helpers called by `roll init` |
15
+ | `roll-setup.py` | Setup-flow helpers (convention sync, tool config write) |
16
+ | `roll-brief.py` | Brief generation: reads cycle records and produces the feature brief |
17
+ | `roll-backlog.py` | Backlog read/write helpers |
18
+ | `roll-peer.py` | Peer review coordination helpers |
19
+ | `roll-help.py` | Renders `roll --help` output |
20
+ | `roll-plan-validate.py` | Validates plan files before story execution |
21
+ | `model_prices.py` | List-price table for AI model API pricing (per MTok, native currency) |
22
+ | `prices_fetcher.py` | Fetches fresh price snapshots from vendor APIs |
23
+ | `roll_render.py` | Shared rendering utilities (tables, color, formatting) |
24
+ | `loop-fmt.py` | Loop log formatter (ANSI-strip, timestamp alignment) |
25
+ | `loop_unstick.py` | Diagnostic: detects and unsticks hung loop state |
26
+ | `backfill-pi-usage.py` | Backfills pi/deepseek token and cost data into existing cycle records |
27
+ | `changelog_audit.py` | Audits CHANGELOG.md against backlog entries |
28
+ | `i18n.sh` | Shell wrapper that delegates i18n string lookups to `lib/i18n/` |
29
+ | `slides-render.py` | Renders `.deck.md` → HTML slides |
30
+ | `slides-validate.py` | Validates deck file syntax and asset references |
31
+
32
+ ## Sub-directories
33
+
34
+ - `agent_usage/` — token-usage capture and cost attribution per agent invocation
35
+ - `i18n/` — localized string tables for all CLI output (EN + ZH)
36
+ - `prices/` — price snapshot JSON files (per-vendor, dated)
37
+ - `slides/` — slide component library for `roll deck`
38
+
39
+ ## Dependencies
40
+
41
+ Imported by `bin/roll` via subprocess calls (`python3 lib/<script>.py`).
42
+ No third-party pip dependencies — standard library only (json, sys, os, re, datetime).
@@ -0,0 +1,54 @@
1
+ > **Draft** — auto-generated by roll-doc on 2026-05-28. Review before treating as authoritative.
2
+
3
+ # lib/i18n/ — Localized string tables
4
+
5
+ All user-visible CLI output strings for both `en` and `zh` locales, organized by command domain.
6
+
7
+ ## Structure
8
+
9
+ Each `.sh` file under `lib/i18n/` is a shell associative-array fragment exporting a `MSG_*` namespace:
10
+
11
+ ```
12
+ lib/i18n/
13
+ ├── agent.sh # roll agent use / install messages
14
+ ├── alert.sh # ALERT lifecycle messages
15
+ ├── backlog.sh # backlog read/write output
16
+ ├── brief.sh # roll-brief generation output
17
+ ├── changelog.sh # changelog sync messages
18
+ ├── ci.sh # CI self-heal messages
19
+ ├── debug.sh # roll debug diagnostics
20
+ ├── doctor.sh # roll-doctor check output
21
+ ├── dream.sh # roll-.dream scan output
22
+ ├── init.sh # roll init setup messages
23
+ ├── lang.sh # locale detection + ROLL_LANG resolution
24
+ ├── loop.sh # roll loop subcommand output (largest file)
25
+ ├── migrate.sh # roll migrate messages
26
+ ├── offboard.sh # roll offboard output
27
+ ├── onboard.sh # roll onboard / legacy-onboard output
28
+ ├── peer.sh # roll peer review messages
29
+ ├── peer_help.sh # peer --help text
30
+ ├── peer_reset.sh # peer reset confirmation messages
31
+ ├── peer_status.sh # peer status output
32
+ ├── prices_refresh.sh # prices refresh output
33
+ └── skills/ # per-skill i18n overrides
34
+ ```
35
+
36
+ ## Locale selection
37
+
38
+ `ROLL_LANG` env var controls which locale is active. Resolved by `lang.sh`:
39
+
40
+ 1. `ROLL_LANG` explicit → use it
41
+ 2. `LC_ALL` / `LANG` contains `zh` → `zh`
42
+ 3. Default → `en`
43
+
44
+ Each `.sh` file branches on `ROLL_LANG` and exports the appropriate string set.
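Note on the locale rules above: the three-step resolution order can be sketched as a single function. This is an illustration only, since the real `lang.sh` is not part of this diff. The name `resolve_roll_lang` and its optional positional arguments are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical resolver following the three documented rules.
# Arguments default to the environment, which keeps the function easy to test.
resolve_roll_lang() {
  local explicit="${1-${ROLL_LANG:-}}"
  local lc_all="${2-${LC_ALL:-}}"
  local lang="${3-${LANG:-}}"

  # 1. An explicit ROLL_LANG wins.
  if [ -n "$explicit" ]; then
    printf '%s\n' "$explicit"
    return 0
  fi

  # 2. LC_ALL or LANG containing "zh" selects zh.
  #    The space keeps the two values from matching across their boundary.
  case "${lc_all} ${lang}" in
    *zh*) printf 'zh\n' ;;
    *)    printf 'en\n' ;;   # 3. Everything else falls back to en.
  esac
}

resolve_roll_lang "" "" "zh_CN.UTF-8"
```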
45
+
46
+ ## skills/
47
+
48
+ Per-skill message overrides for `roll-build`, `roll-design`, `roll-fix`, `roll-loop`, `roll-onboard`. Same structure as top-level files — sourced after the base file to allow skill-specific overrides without editing shared strings.
49
+
50
+ ## Adding a new string
51
+
52
+ 1. Add the key to both `en` and `zh` branches in the appropriate domain file.
53
+ 2. Reference via `msg <KEY>` in `bin/roll` or the relevant skill.
54
+ 3. Never hardcode user-facing strings in `bin/roll` directly — always go through i18n.
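Note on the steps above: the sketch below walks through them with stand-in helpers. `_i18n_set <lang> <key> <text>` and `msg <key> [args]` are used with the call shapes that appear elsewhere in this diff, but their bodies here are assumptions written for illustration, and `demo.daily_at` is a made-up key.

```bash
#!/usr/bin/env bash
# Hypothetical stand-ins for the real helpers; the actual implementations
# are not part of this diff.
_i18n_set() {  # usage: _i18n_set <lang> <key> <format>
  local var="I18N_${1}_${2//[^A-Za-z0-9]/_}"
  printf -v "$var" '%s' "$3"
}

msg() {  # usage: msg <key> [printf args...]
  local key="$1"; shift
  local var="I18N_${ROLL_LANG:-en}_${key//[^A-Za-z0-9]/_}"
  local fmt="${!var:-$key}"   # fall back to the key when no string exists
  # shellcheck disable=SC2059  # the table entry is the format string
  printf -- "${fmt}\n" "$@"
}

# Step 1: add the key to both locales.
_i18n_set en demo.daily_at "daily at %02d:%02d"
_i18n_set zh demo.daily_at "每天 %02d:%02d"

# Step 2: look it up through msg instead of hardcoding the text.
ROLL_LANG=en msg demo.daily_at 9 30
ROLL_LANG=zh msg demo.daily_at 9 30
```

In this sketch `msg` applies the format arguments itself, which is the call shape the `bin/roll` hunks in this release switch to. One bash detail to keep in mind: `printf '%02d'` rejects an argument such as `08` as invalid octal, so callers would want to pass plain integers.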
@@ -29,3 +29,16 @@ _i18n_set en doctor.pr_event_without_zh "contributors get AI feedback on PR open
29
29
  _i18n_set zh doctor.pr_event_without_zh "PR 一开即触发 AI 评审。"
30
30
  _i18n_set en doctor.pr_event_secret "Then set the API key secret for your configured agent in GitHub repo settings."
31
31
  _i18n_set zh doctor.pr_event_secret "然后在 GitHub 仓库设置中添加你配置的 agent 对应的 API key secret。"
32
+
33
+ _i18n_set en doctor.agent_detection "Agent detection"
34
+ _i18n_set zh doctor.agent_detection "Agent 检测"
35
+ _i18n_set en doctor.agent_installed "CLI on PATH"
36
+ _i18n_set zh doctor.agent_installed "CLI 可用"
37
+ _i18n_set en doctor.agent_missing "CLI not found"
38
+ _i18n_set zh doctor.agent_missing "CLI 未安装"
39
+ _i18n_set en doctor.agent_dir_exists "config dir exists"
40
+ _i18n_set zh doctor.agent_dir_exists "配置目录存在"
41
+ _i18n_set en doctor.agent_dir_missing "config dir missing"
42
+ _i18n_set zh doctor.agent_dir_missing "配置目录不存在"
43
+ _i18n_set en doctor.agent_primary_label "primary"
44
+ _i18n_set zh doctor.agent_primary_label "默认"
package/lib/i18n/loop.sh CHANGED
@@ -3,22 +3,22 @@ _i18n_set en loop.loop_already_enabled_for_this_project "Loop already enabled fo
3
3
  _i18n_set zh loop.loop_already_enabled_for_this_project "当前项目 loop 已启用"
4
4
  _i18n_set en loop.loop_enabled "Loop enabled"
5
5
  _i18n_set zh loop.loop_enabled "已启用"
6
- _i18n_set en loop.roll_loop_s_active_02d_00 " • roll-loop %s active %02d:00–%02d:00 %s"
7
- _i18n_set zh loop.roll_loop_s_active_02d_00 "窗口 %02d:00–%02d:00)\n"
8
- _i18n_set en loop.roll_dream_daily_at_02d_02d " • roll-.dream daily at %02d:%02d"
9
- _i18n_set zh loop.roll_dream_daily_at_02d_02d "每天 %02d:%02d\n"
10
- _i18n_set en loop.roll_brief_daily_at_02d_02d " • roll-brief daily at %02d:%02d"
11
- _i18n_set zh loop.roll_brief_daily_at_02d_02d "每天 %02d:%02d\n"
6
+ _i18n_set en loop.roll_loop_s_active_02d_00 " • roll-loop %s active %02d:00–%02d:00 %s(窗口 %02d:00–%02d:00)"
7
+ _i18n_set zh loop.roll_loop_s_active_02d_00 " roll-loop %s 有效窗口 %02d:00–%02d:00 %s(active %02d:00–%02d:00)"
8
+ _i18n_set en loop.roll_dream_daily_at_02d_02d " • roll-.dream daily at %02d:%02d 每天 %02d:%02d"
9
+ _i18n_set zh loop.roll_dream_daily_at_02d_02d " • roll-.dream daily at %02d:%02d 每天 %02d:%02d"
10
+ _i18n_set en loop.roll_brief_daily_at_02d_02d " • roll-brief daily at %02d:%02d 每天 %02d:%02d"
11
+ _i18n_set zh loop.roll_brief_daily_at_02d_02d " • roll-brief daily at %02d:%02d 每天 %02d:%02d"
12
12
  _i18n_set en loop.loop_already_enabled_for_this_project_2 "Loop already enabled for this project"
13
13
  _i18n_set zh loop.loop_already_enabled_for_this_project_2 "当前项目 loop 已启用"
14
14
  _i18n_set en loop.loop_enabled_2 "Loop enabled"
15
15
  _i18n_set zh loop.loop_enabled_2 "已启用"
16
- _i18n_set en loop.roll_loop_s_active_02d_00_2 " • roll-loop %s active %02d:00–%02d:00 %s"
17
- _i18n_set zh loop.roll_loop_s_active_02d_00_2 "窗口 %02d:00–%02d:00)\n"
18
- _i18n_set en loop.roll_dream_daily_at_02d_02d_2 " • roll-.dream daily at %02d:%02d"
19
- _i18n_set zh loop.roll_dream_daily_at_02d_02d_2 "每天 %02d:%02d\n"
20
- _i18n_set en loop.roll_brief_daily_at_02d_02d_2 " • roll-brief daily at %02d:%02d"
21
- _i18n_set zh loop.roll_brief_daily_at_02d_02d_2 "每天 %02d:%02d\n"
16
+ _i18n_set en loop.roll_loop_s_active_02d_00_2 " • roll-loop %s active %02d:00–%02d:00 %s(窗口 %02d:00–%02d:00)"
17
+ _i18n_set zh loop.roll_loop_s_active_02d_00_2 " roll-loop %s 有效窗口 %02d:00–%02d:00 %s(active %02d:00–%02d:00)"
18
+ _i18n_set en loop.roll_dream_daily_at_02d_02d_2 " • roll-.dream daily at %02d:%02d 每天 %02d:%02d"
19
+ _i18n_set zh loop.roll_dream_daily_at_02d_02d_2 " • roll-.dream daily at %02d:%02d 每天 %02d:%02d"
20
+ _i18n_set en loop.roll_brief_daily_at_02d_02d_2 " • roll-brief daily at %02d:%02d 每天 %02d:%02d"
21
+ _i18n_set zh loop.roll_brief_daily_at_02d_02d_2 " • roll-brief daily at %02d:%02d 每天 %02d:%02d"
22
22
  _i18n_set en loop.loop_not_enabled_for_this_project "Loop not enabled for this project"
23
23
  _i18n_set zh loop.loop_not_enabled_for_this_project "当前项目 loop 未启用"
24
24
  _i18n_set en loop.loop_disabled "Loop disabled"
@@ -0,0 +1,35 @@
1
+ > **Draft** — auto-generated by roll-doc on 2026-05-28. Review before treating as authoritative.
2
+
3
+ # lib/prices/ — Model price snapshots
4
+
5
+ Dated JSON snapshots of AI model list prices, used by `roll loop status` to compute per-cycle cost in both USD and native currency (CNY for pi/DeepSeek/Kimi).
6
+
7
+ ## Files
8
+
9
+ | File | Contents |
10
+ |------|---------|
11
+ | `snapshot-2026-05-22.json` | Multi-vendor snapshot (Claude, GPT, Gemini, DeepSeek, Kimi, pi) |
12
+ | `snapshot-2026-05-23-deepseek.json` | DeepSeek-specific refresh |
13
+ | `snapshot-2026-05-23-kimi.json` | Kimi-specific refresh |
14
+
15
+ ## Format
16
+
17
+ Each snapshot is a JSON object keyed by model ID:
18
+
19
+ ```json
20
+ {
21
+ "claude-opus-4-7": {
22
+ "input_per_mtok": 15.0,
23
+ "output_per_mtok": 75.0,
24
+ "cache_write_per_mtok": 18.75,
25
+ "cache_read_per_mtok": 1.5,
26
+ "currency": "USD"
27
+ }
28
+ }
29
+ ```
30
+
31
+ CNY-priced models (pi, DeepSeek, Kimi) use `"currency": "CNY"`.
32
+
33
+ ## Refresh
34
+
35
+ `prices_fetcher.py` fetches fresh snapshots from vendor pricing APIs and writes a new dated file here. Run via `roll prices refresh`.
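Note on the snapshot format above: per-MTok prices turn into a cost by scaling token counts by one million. A minimal sketch of that arithmetic follows. How `roll loop status` actually computes cost is not shown in this diff, the helper `cost_from_prices` is hypothetical, and the cache fields are left out for brevity.

```bash
#!/usr/bin/env bash
# Hypothetical helper: cost of one call from per-MTok list prices.
# usage: cost_from_prices <input_tokens> <output_tokens> <input_per_mtok> <output_per_mtok>
cost_from_prices() {
  # LC_ALL=C keeps the decimal separator a dot regardless of the user's locale.
  LC_ALL=C awk -v it="$1" -v ot="$2" -v ip="$3" -v op="$4" \
    'BEGIN { printf "%.4f\n", (it * ip + ot * op) / 1000000 }'
}

# 200k input + 10k output tokens at the sample claude-opus-4-7 prices above
cost_from_prices 200000 10000 15.0 75.0
```

The result is expressed in the snapshot entry's own `currency`, so a CNY-priced model would yield a CNY figure with the same formula.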
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@seanyao/roll",
3
- "version": "2026.528.1",
3
+ "version": "2026.529.1",
4
4
  "description": "Roll — Roll out features with AI agents",
5
5
  "scripts": {
6
6
  "test": "bash tests/run.sh"
@@ -125,16 +125,26 @@ readers. The rule mirrors the gate in Step 2.
125
125
  ### Step 1.5 — Pre-run CI Health Check
126
126
 
127
127
  Call `roll loop precheck-ci` before scanning BACKLOG. This is a **defensive gate**
128
- against building on a broken base if the most recent commit on the branch
129
- has red CI, the loop must not stack new commits on top (which would create the
130
- exact stuck-red state FIX-026 traces to).
131
-
132
- - HEAD CI green / pending / no-run-yet proceed to Step 2.
133
- - HEAD CI red write ALERT, **do not pick up any stories this cycle**,
134
- exit cleanly. The next cycle will retry; the human must fix CI manually
135
- (typically by reverting or pushing a green commit) before the loop resumes.
136
- - `gh` missing or repo unparseable → graceful skip (`roll loop precheck-ci`
137
- returns 0); the post-build `_loop_enforce_ci` remains the strict gate.
128
+ against building on a broken base. Check the **exit code** and route accordingly:
129
+
130
+ | Exit code | Meaning | Action |
131
+ |-----------|---------|--------|
132
+ | `0` | CI green / pending / unknown | Proceed to Step 1.6 (PR Inbox) and Step 2 (BACKLOG scan) |
133
+ | `1` | CI red AND heal exhausted or `ROLL_LOOP_NO_HEAL=1` | ALERT already written; exit cleanly this cycle |
134
+ | `2` | CI red AND heal attempt allowed (US-LOOP-046) | **Hot-fix path** — skip BACKLOG, fix CI instead (see below) |
135
+
136
+ `gh` missing or repo unparseable → `precheck-ci` returns `0`; graceful skip.
137
+
138
+ **Hot-fix path (exit code 2) — US-LOOP-046:**
139
+
140
+ Do NOT pick any BACKLOG stories this cycle. Instead:
141
+
142
+ 1. Capture context: `roll loop hotfix-head-context` → prints path to context log
143
+ 2. Invoke `Skill("roll-fix")` with brief:
144
+ `"CI red on HEAD. Failing run logs at <context-log-path>. Diagnose root cause, fix via TCR, commit, push. Do NOT change BACKLOG status."`
145
+ 3. After `roll-fix` completes, re-run `roll ci --wait` to verify the fix
146
+ 4. If CI is still red: run `roll loop precheck-ci` again; if it returns `1` (heal exhausted),
147
+ exit cleanly — ALERT was already written by the precheck
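Note on the exit-code table above: the routing can be sketched as a single `case` statement. The function name `route_precheck` and the action labels are hypothetical; only the three exit codes and their meanings come from the table.

```bash
#!/usr/bin/env bash
# Hypothetical router: map a precheck exit code to the action in the table.
route_precheck() {
  case "$1" in
    0) echo "proceed" ;;       # green / pending / unknown
    1) echo "exit-cycle" ;;    # red, heal exhausted or disabled; ALERT written
    2) echo "hot-fix" ;;       # red, heal attempt allowed
    *) echo "unexpected" ;;    # not documented; treat defensively
  esac
}

# In a real cycle the code would come from the CLI, for example:
#   roll loop precheck-ci; route_precheck "$?"
for code in 0 1 2; do
  printf '%s -> %s\n' "$code" "$(route_precheck "$code")"
done
```

Reading `$?` immediately after the precheck matters, because any intervening command overwrites it.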
138
148
 
139
149
  ### Step 1.6 — PR Inbox (US-AUTO-034)
140
150