okstra 0.41.0 → 0.43.0

@@ -239,6 +239,7 @@ keeps task identity, paths, and workflow state in per-process environment variables
239
239
  - The default worker roles of the standard workflow are `Claude worker`, `Codex worker`, and `Report writer worker`; `Gemini worker` is an option that is included only when named via `--workers` or in a profile.
240
240
  - Claude reads the task bundle and performs both the worker role assignment and the final judgment.
241
241
  - The okstra Claude assets installed in the user's home (`~/.claude/skills`, `~/.claude/agents`) steer Claude to try Agent Teams first and to use the sequential/background fallback only when a team cannot be formed.
242
+ - **Team lifecycle**: In Phase 3 the lead creates the team with `TeamCreate(team_name: "okstra-<task-key>")` and dispatches the workers as its members. At run end (**after** Phase 7 token collection; automatic, no prompt) the lead sends a graceful shutdown to the members listed in the team config via `SendMessage({type: "shutdown_request"})` and then releases the team with `TeamDelete`. `TeamDelete` fails while an active member remains, so it is called only after shutdown is confirmed; it removes only `~/.claude/teams/<team>/` and `~/.claude/tasks/<team>/` and preserves the `~/.claude/projects/` jsonl files that are the token-collection source. Without the teardown, worker teammates keep accumulating in the FleetView roster (*Run-end team teardown* in `prompts/profiles/_common-contract.md`). In the no-`team_name` fallback there is no team, so the step is silently skipped.
242
243
 
243
244
  ## Claude prompt contract
244
245
 
@@ -909,10 +910,10 @@ Phase 7 step 1.5 takes a single final-report MD as input and determines two views
909
910
  ### Live-log mirror (codex / gemini wrapper)
910
911
 
911
912
  - `scripts/okstra-codex-exec.sh` and `scripts/okstra-gemini-exec.sh` create a `<prompt>.log` sidecar next to the prompt path on every dispatch and mirror stdout into it (`tee`, with the exit code preserved via `PIPESTATUS[0]`). stderr is appended to the same file (preserving the subagent stderr capture contract), and the file is truncated on every dispatch. The calling subagent polls `BashOutput` at 60s intervals, so without the sidecar the user cannot detect a stalled state during long-running runs (a large-codebase scan in analysis, cargo / pytest in implementation); the mirror resolves this.
912
- - In a lead environment where `$TMUX` is set, the wrapper automatically splits a sibling pane and runs `tail -F <log-path>` in it. The trace pane title is `<cli>-<role>-<pid>-trace[from=<caller-pane-id>]` (e.g. `codex-worker-93421-trace[from=%5]`, `gemini-executor-93422-trace[from=%5]`); at the same time the caller (worker) pane title is set to `<cli>-<role>-<pid>`. `<pid>` is the wrapper's own PID, so two or more workers of the same role spawned concurrently remain distinguishable, and because the caller pane id is embedded in the trace title, the trace ↔ worker mapping does not break even if the worker pane title is overwritten from outside (the Claude Code TUI keeps emitting OSC 2 escapes for its own pane title). The caller pane id is taken first from `$TMUX_PANE` and, if that is empty, falls back to `tmux display-message -p '#{pane_id}'` (the active pane), so the caller pane is still identified correctly when `$TMUX_PANE` is stripped, as in the Claude Code Bash tool environment. The trace pane split is explicitly anchored to the caller pane with `-t`. role is the wrapper's optional 5th positional argument and defaults to `worker` when omitted. A caller that wants a different label (e.g. `executor`) must pass it explicitly as the 5th argument. The caller pane title as it was just before wrapper entry is captured and restored in the EXIT trap, so no stale title is left between dispatches. Focus returns to the caller pane, and the pane is kept after the CLI exits so it can be scrolled back. An unset `$TMUX`, a failed split, an old tmux version, and every other such path degrade silently.
913
- - **Automatic cleanup on Claude `/exit`**: The `tail -F` in a trace pane is a child of the tmux shell and would survive Claude exiting. To prevent that, the wrapper appends each pane id it spawns to a registry keyed by the caller's `$TMUX_PANE` (`${TMPDIR:-/tmp}/okstra-trace-panes/<caller-pane>.list`). The `hooks.SessionEnd` entry in `templates/reports/settings.template.json` calls `$HOME/.okstra/bin/okstra-trace-cleanup.sh`, which reads the registry of its own caller pane and runs `tmux kill-pane`. Because the scope is per caller pane, multiple Claude instances in the same tmux session do not kill each other's trace panes. If tmux is absent or a pane id is stale, the cleanup degrades silently.
914
- - **Automatic cleanup on phase transition, including worker-agent panes**: `okstra-trace-cleanup.sh` closes not only trace panes (registry) but also the worker-agent panes occupied by dispatched subagents (titles `claude-worker` / `codex-worker` / `gemini-worker` / `report-writer-worker`), which it identifies by a title allowlist within the lead's session (`tmux list-panes -s`). The lead's own pane (`$TMUX_PANE`) is never killed, even if its title matches. Immediately before dispatching the workers of a new phase (just before the `PROGRESS: phase-5.5-convergence` / `phase-6-synthesis` markers), the lead calls this script with no arguments to clean up the previous phase's panes without a prompt.
915
- - **User confirmation at phase end**: At the final end of the run (the last step), the lead prints the list of remaining okstra panes (worker-agent + trace) with `okstra-trace-cleanup.sh --list`, asks the user to choose between "close all" and "leave as is", and acts on the answer (the *Phase wrap-up* item in `prompts/profiles/_common-contract.md`). In environments where `$TMUX_PANE` is unset, the step itself is silently skipped. `--list` mode does not kill panes and only prints `<pane_id>\t<pane_title>`, so the user can see exactly what would be closed.
913
+ - In a lead environment where tmux is reachable, the wrapper automatically splits a sibling pane and runs `tail -F <log-path>` in it. The trace pane title is the caller (worker) pane title with `-tail` appended, `<cli>-<role>-<pid>-tail` (e.g. `codex-worker-93421-tail`); at the same time the caller (worker) pane title is set to `<cli>-<role>-<pid>`. `<pid>` is the wrapper's own PID, so two or more workers of the same role spawned concurrently remain distinguishable, and the operator can visually map `<caller>` ↔ `<caller>-tail`. **Caller pane resolution**: the Claude Code Bash tool now removes both `$TMUX` and `$TMUX_PANE` from the environment, so the wrapper does not depend on env variables. The wrapper (1) derives `<RUN_DIR>` (= `dirname(dirname(prompt_path))`, paths.py SSOT) from the prompt path and (2) reads `<RUN_DIR>/state/lead-pane.id`, which the lead recorded once from its own foreground pane, and uses it as the split anchor (reliable even for background dispatch; unlike active-pane guessing, it stays correct when the user switches panes). If the recorded file is missing or the pane is stale, it falls back to `tmux display-message -p '#{pane_id}'` (the active pane). The trace pane split is explicitly anchored to the caller pane with `-t`. role is the wrapper's optional 5th positional argument and defaults to `worker` when omitted. The caller pane title is captured and restored in the EXIT trap, so no stale title is left between dispatches. Focus returns to the caller pane, and the pane is kept after the CLI exits so it can be scrolled back. An unreachable tmux, a failed split, an old tmux version, and every other such path degrade silently.
914
+ - **Cleanup via run-scoped tagging**: The `tail -F` in a trace pane is a child of the tmux shell, so it survives Claude exiting. The wrapper tags each pane it spawns with `tmux set-option -p @okstra_trace_run=<RUN_DIR>`, and `okstra-trace-cleanup.sh` discovers those panes server-wide by tag from `tmux list-panes -a` and runs `tmux kill-pane` on them. This works without tmux env variables or a pane-id registry, and because the tag is run-scoped it does not kill the trace panes of other okstra runs executing concurrently. Cleanup has two entry shapes: either the lead calls it with `--run-dir <RUN_DIR>` (cleans that run's trace + worker-agent panes), or the `hooks.SessionEnd` entry in `templates/reports/settings.template.json` calls it with `--reap` (bulk-cleans trace panes whose tag lies under `$CLAUDE_PROJECT_DIR/.okstra/`; intended for exit time, when no single run-dir applies). If tmux is absent or a pane id is stale, the cleanup degrades silently.
915
+ - **Automatic cleanup on phase transition, including worker-agent panes**: `okstra-trace-cleanup.sh --run-dir <RUN_DIR>` closes not only the tagged trace panes but also the worker-agent panes occupied by dispatched subagents (titles `claude-worker` / `codex-worker` / `gemini-worker` / `report-writer-worker`), which it identifies by a title allowlist within the lead's session (`tmux list-panes -s -t <lead-pane>`); worker-agent panes are owned by the harness and cannot be tagged. The session scope and the exclusion of the lead's own pane are both determined from `<RUN_DIR>/state/lead-pane.id`, and the lead's own pane is never killed, even if its title matches. Immediately before dispatching the workers of a new phase (just before the `PROGRESS: phase-5.5-convergence` / `phase-6-synthesis` markers), the lead calls this script with `--run-dir` to clean up the previous phase's panes without a prompt.
916
+ - **User confirmation at phase end**: At the final end of the run (the last step), the lead prints the list of remaining okstra panes (worker-agent + trace) with `okstra-trace-cleanup.sh --list --run-dir <RUN_DIR>`, asks the user to choose between "close all" and "leave as is", and acts on the answer (the *Phase wrap-up* item in `prompts/profiles/_common-contract.md`). In environments where `<RUN_DIR>/state/lead-pane.id` is empty (that is, outside tmux), the step itself is silently skipped. `--list` mode does not kill panes and only prints `<pane_id>\t<pane_title>`, so the user can see exactly what would be closed.
916
917
  - For on-disk accumulation, the `okstra-logs` skill proposes a read-only inventory plus cleanup commands (execution is left to the user via copy-paste).
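
The run-scoped tag lookup described in the bullets above can be sketched as a small filter. This is only an illustration under stated assumptions: the function name `select_run_panes` is hypothetical, and the body of `okstra-trace-cleanup.sh` is not fully shown in this diff, so the real script may select panes differently. The sketch assumes the pane list arrives on stdin as `<pane_id><TAB><tag>` lines.

```shell
# Hypothetical helper, not part of okstra: print the pane ids whose
# run-dir tag equals the requested run dir exactly.
#   $1     run dir to match
#   stdin  "<pane_id><TAB><tag>" lines; the tag is empty for untagged panes
select_run_panes() {
  # Pass the run dir through the environment so awk does not
  # interpret backslash escapes in it (as it would with -v).
  OKSTRA_RUN="$1" awk -F '\t' \
    '$2 != "" && $2 == ENVIRON["OKSTRA_RUN"] { print $1 }'
}
```

In a live tmux session the input could plausibly come from `tmux list-panes -a -F` with a format such as `#{pane_id}<TAB>#{@okstra_trace_run}`, and each printed id could then be passed to `tmux kill-pane -t`. Matching on the exact tag value is what keeps concurrent runs from closing each other's panes.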
917
918
 
918
919
  ### Linked-worktree `.git/` write permission (codex / gemini)
package/docs/kr/cli.md CHANGED
@@ -591,4 +591,4 @@ chmod +x ~/.local/bin/okstra-ctl
591
591
 
592
592
  ### Live-log sidecar
593
593
 
594
- The codex / gemini wrappers create a `runs/<task-type>/prompts/<worker>-prompt-<phase>-<seq>.log` sidecar on every dispatch and mirror stdout / stderr into it. When the lead is launched inside tmux, the wrapper automatically splits off a `tail -F` pane (trace pane title: `<cli>-<role>-<pid>-trace`, caller (worker) pane title: `<cli>-<role>-<pid>`; the wrapper PID distinguishes concurrent dispatches of the same role). The split trace pane is registered in a registry keyed by the caller's `$TMUX_PANE`, and on Claude `/exit` the `SessionEnd` hook cleans it up automatically via `okstra-trace-cleanup.sh`. The same script also cleans up the dispatched worker-agent panes (titles `claude-worker` / `codex-worker` / `gemini-worker` / `report-writer-worker`) within the lead's session (the lead's own pane is excluded), and the lead calls it right before a new phase dispatch to clean up the previous phase's okstra panes automatically. The usage inventory and the `find … -delete` cleanup commands are proposed read-only by the `okstra-logs` skill. For the detailed wiring, see the *Live-log mirror* section of [`docs/kr/architecture.md`](architecture.md).
594
+ The codex / gemini wrappers create a `runs/<task-type>/prompts/<worker>-prompt-<phase>-<seq>.log` sidecar on every dispatch and mirror stdout / stderr into it. When the lead is launched inside tmux, the wrapper automatically splits off a `tail -F` pane (trace pane title: `<cli>-<role>-<pid>-tail`, caller (worker) pane title: `<cli>-<role>-<pid>`; the wrapper PID distinguishes concurrent dispatches of the same role). The split trace pane is tagged with the `@okstra_trace_run=<RUN_DIR>` pane user-option, and on Claude `/exit` the `SessionEnd` hook cleans it up automatically via `okstra-trace-cleanup.sh --reap` (scoped to `$CLAUDE_PROJECT_DIR/.okstra/`). When the lead calls the same script with `--run-dir <RUN_DIR>`, it cleans up that run's trace panes together with the dispatched worker-agent panes (titles `claude-worker` / `codex-worker` / `gemini-worker` / `report-writer-worker`) within the lead's session (the lead's own pane is excluded), and the lead calls it right before a new phase dispatch to clean up the previous phase's okstra panes automatically. The usage inventory and the `find … -delete` cleanup commands are proposed read-only by the `okstra-logs` skill. For the detailed wiring, see the *Live-log mirror* section of [`docs/kr/architecture.md`](architecture.md).
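
The run-dir derivation that the new tagging scheme relies on can be illustrated in isolation. The following is a minimal sketch, assuming only the layout stated in the diff (`<RUN_DIR>/prompts/<prompt>` and `<RUN_DIR>/state/lead-pane.id`); the function names `derive_run_dir` and `read_lead_pane` are hypothetical, and the tmux liveness check that the wrappers perform on the recorded pane id is deliberately left out.

```shell
# Hypothetical helpers, not part of okstra.

# Print the physical run dir for a dispatched prompt path, i.e.
# dirname(dirname(prompt_path)). Prints nothing if the dir cannot be entered.
derive_run_dir() {
  (cd "$(dirname "$1")/.." 2>/dev/null && pwd -P) || true
}

# Print the first line of <RUN_DIR>/state/lead-pane.id, or nothing when the
# run dir is empty or the file is missing or unreadable.
read_lead_pane() {
  [ -n "$1" ] && [ -r "$1/state/lead-pane.id" ] || return 0
  head -n 1 "$1/state/lead-pane.id" 2>/dev/null || true
}
```

Both helpers print an empty string on failure instead of returning an error, which mirrors the silent-degrade behaviour the documentation describes: a caller can simply test the result for emptiness and skip the tmux steps.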
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "okstra",
3
- "version": "0.41.0",
3
+ "version": "0.43.0",
4
4
  "description": "Multi-agent cross-verification orchestrator runtime + Claude Code skills.",
5
5
  "license": "MIT",
6
6
  "author": "devonshin",
@@ -1,5 +1,5 @@
1
1
  {
2
- "package": "0.41.0",
3
- "builtAt": "2026-06-02T15:52:15.071Z",
2
+ "package": "0.43.0",
3
+ "builtAt": "2026-06-04T04:59:06.499Z",
4
4
  "repoRoot": "/home/runner/work/okstra/okstra"
5
5
  }
@@ -43,7 +43,7 @@ This SKILL.md is the operating contract and phase index. Detailed procedures liv
43
43
  | 5. Fallback | Sequential/background dispatch when Teams unavailable | `okstra-team-contract` |
44
44
  | 5.5 Convergence | Cross-verify findings across workers | `okstra-convergence` |
45
45
  | 6. Synthesis | Dispatch Report writer worker, review draft. **For `implementation-planning`: then run the Phase 6 plan-body verification sub-step (see Phase 6 section below).** | `okstra-report-writer` + `okstra-convergence` (sub-step) |
46
- | 7. Persist | Run token-usage collector, update manifests | `okstra-report-writer` |
46
+ | 7. Persist | Run token-usage collector, update manifests, then disband the worker team (shutdown teammates + `TeamDelete`, after collection) | `okstra-report-writer` + `_common-contract.md` "Run-end team teardown" |
47
47
 
48
48
  ## Core operating contract
49
49
 
@@ -94,6 +94,7 @@ Required checkpoints:
94
94
  - `PROGRESS: phase-5.5-convergence round=<N> queue=<count>` — at the start of each convergence round (Phase 5.5).
95
95
  - `PROGRESS: phase-6-synthesis dispatching report-writer-worker` — at the start of Phase 6.
96
96
  - `PROGRESS: phase-7-persist updating manifests` — at the start of Phase 7.
97
+ - `PROGRESS: phase-7-teardown disbanding team` — after token-usage collection, immediately before shutting down worker teammates + `TeamDelete` (Teams mode only; see `_common-contract.md` "Run-end team teardown"). Skipped in the no-`team_name` fallback.
97
98
  - `PROGRESS: complete final-report=<relative-path>` — final summary line, after all persistence.
98
99
 
99
100
  These lines are the only structured signal the user has during a long run. Do NOT replace them with prose ("Now I'm starting Phase 2..."), do NOT skip a checkpoint because "the previous message already said that", and do NOT batch multiple checkpoints into one. Each line stands alone so the user (or any operator scraping stdout) can timestamp it externally.
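
Because each checkpoint stands alone on its own line, an operator can scrape them with a few lines of shell. The filter below is a sketch only: `progress_phases` is a hypothetical name, and it assumes the lead's stdout is available on stdin with each `PROGRESS: ` marker at the start of a line.

```shell
# Hypothetical scraper, not part of okstra: read lead output on stdin and
# print the phase token (the first word after "PROGRESS: ") of every
# checkpoint line, ignoring all other lines.
progress_phases() {
  while IFS= read -r line; do
    case "$line" in
      "PROGRESS: "*)
        rest="${line#PROGRESS: }"     # drop the marker
        printf '%s\n' "${rest%% *}"   # keep up to the first space
        ;;
    esac
  done
}
```

Prefixing each emitted token with the output of `date` in a follow-up loop would give the external timestamps mentioned above.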
@@ -30,7 +30,7 @@ You are a Codex worker agent. Your job is to execute the OpenAI Codex CLI and re
30
30
  $HOME/.okstra/bin/okstra-codex-exec.sh "<absolute-project-root>" "<assigned-model-execution-value>" "<absolute-prompt-history-path>" [<absolute-worktree-path>] [<role>]
31
31
  ```
32
32
 
33
- The fifth argument `<role>` is folded into both the caller (worker) pane title `codex-<role>-<pid>` and the sibling trace-pane title `codex-<role>-<pid>-trace` (`<pid>` = the wrapper's PID, present so concurrent dispatches of the same role can be told apart). Pass the literal string `worker` for every dispatch from this subagent. The wrapper defaults to `worker` when the argument is omitted, but pass it explicitly so the dispatch is self-describing.
33
+ The fifth argument `<role>` is folded into both the caller (worker) pane title `codex-<role>-<pid>` and the sibling trace-pane title `codex-<role>-<pid>-tail` (`<pid>` = the wrapper's PID, present so concurrent dispatches of the same role can be told apart). Pass the literal string `worker` for every dispatch from this subagent. The wrapper defaults to `worker` when the argument is omitted, but pass it explicitly so the dispatch is self-describing.
34
34
 
35
35
  The fourth argument is **mandatory for implementation phase** and optional otherwise. It must be the literal `EXECUTOR_WORKTREE_PATH` recorded in the run context; the wrapper forwards it to codex as `--add-dir`, which grants the codex sandbox write access to the worktree (where all implementation-phase mutations occur). Without it, codex's `workspace-write` sandbox is anchored only at `<project-root>` and rejects every Edit/Write that targets the worktree (EPERM), which is the failure pattern that originally motivated this argument.
36
36
 
@@ -30,7 +30,7 @@ You are a Gemini worker agent. Your job is to execute the Google Gemini CLI and
30
30
  $HOME/.okstra/bin/okstra-gemini-exec.sh "<absolute-project-root>" "<assigned-model-execution-value>" "<absolute-prompt-history-path>" [<absolute-worktree-path>] [<role>]
31
31
  ```
32
32
 
33
- The fifth argument `<role>` is folded into both the caller (worker) pane title `gemini-<role>-<pid>` and the sibling trace-pane title `gemini-<role>-<pid>-trace` (`<pid>` = the wrapper's PID, present so concurrent dispatches of the same role can be told apart). Pass the literal string `worker` for every dispatch from this subagent. The wrapper defaults to `worker` when the argument is omitted, but pass it explicitly so the dispatch is self-describing.
33
+ The fifth argument `<role>` is folded into both the caller (worker) pane title `gemini-<role>-<pid>` and the sibling trace-pane title `gemini-<role>-<pid>-tail` (`<pid>` = the wrapper's PID, present so concurrent dispatches of the same role can be told apart). Pass the literal string `worker` for every dispatch from this subagent. The wrapper defaults to `worker` when the argument is omitted, but pass it explicitly so the dispatch is self-describing.
34
34
 
35
35
  The fourth argument is **mandatory for implementation phase** and optional otherwise. It must be the literal `EXECUTOR_WORKTREE_PATH` recorded in the run context; the wrapper appends it to gemini's `--include-directories` list so the model can both read and operate on the worktree alongside project-root.
36
36
 
@@ -187,28 +187,40 @@ python3 "$script_dir/okstra-wrapper-status.py" \
187
187
  init "$status_path" "$(basename "$0")" "$role" "$$" "$started_ts" "$log_path" \
188
188
  >>"$log_path" 2>&1 || true
189
189
 
190
- # Resolve caller pane id robustly. tmux normally exports both `$TMUX` and
191
- # `$TMUX_PANE` to processes started inside a pane, but Claude Code's Bash
192
- # tool can drop `$TMUX_PANE` while preserving `$TMUX`, which would
193
- # silently skip the caller-pane rename below AND let `tmux split-window`
194
- # attach the trace pane to whatever tmux currently considers active
195
- # (not necessarily Claude's pane). When the wrapper is launched from
196
- # Claude Code, the Claude session's pane IS the active pane at this
197
- # moment, so falling back to `display-message -p '#{pane_id}'` recovers
198
- # the correct id.
190
+ # Derive the okstra run dir from the prompt path. paths.py is the SSOT:
191
+ # dispatched prompts live at `<RUN_DIR>/prompts/<cli>-worker-prompt<NNN>.md`,
192
+ # so the run dir is two levels up. Used to (a) read the lead pane the lead
193
+ # recorded in its own foreground pane and (b) tag the trace pane so cleanup
194
+ # can find exactly this run's panes without any tmux env var. Empty if the
195
+ # derivation fails; every dependent step below then degrades to a no-op.
196
+ run_dir="$(cd "$(dirname "$prompt_path")/.." 2>/dev/null && pwd -P || true)"
197
+ lead_pane_file="${run_dir:+$run_dir/state/lead-pane.id}"
198
+
199
+ # Resolve the pane to anchor the trace split to. Claude Code's Bash tool now
200
+ # strips BOTH `$TMUX` and `$TMUX_PANE`, and this wrapper frequently runs
201
+ # backgrounded — so the bare active-pane probe can land on whatever pane the
202
+ # user happens to be looking at now, not Claude's. Prefer the lead pane the
203
+ # lead captured ONCE in its own foreground pane (reliable, see
204
+ # `_common-contract.md`); fall back to `$TMUX_PANE`, then the active-pane
205
+ # probe. A stale recorded id (pane since closed) is rejected via a liveness
206
+ # check so we never anchor the split to a dead pane.
199
207
  caller_pane="${TMUX_PANE:-}"
200
- if [[ -z "$caller_pane" && -n "${TMUX:-}" ]]; then
208
+ if [[ -z "$caller_pane" && -n "$lead_pane_file" && -r "$lead_pane_file" ]]; then
209
+ cand="$(head -n1 "$lead_pane_file" 2>/dev/null || true)"
210
+ if [[ -n "$cand" ]] && tmux display-message -p -t "$cand" '#{pane_id}' >/dev/null 2>&1; then
211
+ caller_pane="$cand"
212
+ fi
213
+ fi
214
+ if [[ -z "$caller_pane" ]]; then
201
215
  caller_pane=$(tmux display-message -p '#{pane_id}' 2>/dev/null || true)
202
216
  fi
203
217
 
204
218
  # Pane titles: worker (caller) pane gets `codex-<role>-<pid>`; the sibling
205
- # trace pane appends `-trace[from=<caller-pane-id>]`. The wrapper PID
206
- # disambiguates concurrent dispatches of the same role; the embedded
207
- # caller pane id keeps the trace ↔ worker mapping visible even if the
208
- # worker pane's title is later overwritten by the parent process (e.g.
209
- # Claude Code's TUI emitting OSC 2 escape sequences on its own pane).
219
+ # trace pane is that same caller title with a `-tail` suffix, so the
220
+ # operator can visually pair `<caller> ↔ <caller>-tail`. The wrapper PID in
221
+ # the caller title disambiguates concurrent dispatches of the same role.
210
222
  pane_label="codex-${role}-$$"
211
- trace_label="${pane_label}-trace[from=${caller_pane:-?}]"
223
+ trace_label="${pane_label}-tail"
212
224
 
213
225
  # Capture the caller pane's current title so the EXIT trap can restore it
214
226
  # once the wrapper returns. Empty when not in tmux or capture fails — the
@@ -243,42 +255,35 @@ fi
243
255
  # for the wrapper to exit. This fires in every phase the wrapper is invoked
244
256
  # from (analysis, error-analysis, implementation-planning, implementation,
245
257
  # …) — long-running codex dispatches are not implementation-specific. The
246
- # new pane carries the title `codex-<role>-<pid>-trace[from=<caller-pane>]`
247
- # so the operator can map trace ↔ worker by pane id even when the worker
248
- # pane title is later overwritten by Claude Code. The split is explicitly
249
- # anchored to the caller pane (`-t "$caller_pane"`) to avoid attaching to
250
- # tmux's idle active pane when `$TMUX_PANE` was missing. `role` is the
251
- # optional 5th positional arg (defaults to `worker`); callers that
252
- # dispatch a different role (e.g. `executor`) must pass it explicitly.
253
- # The `<pid>` suffix is the wrapper's PID and disambiguates concurrent
254
- # dispatches of the same role. The pane uses `tail -F` (follow-by-name)
255
- # so it survives any truncation a re-dispatch performs on the same log
256
- # path. Failures are tolerated silently: missing $TMUX, a tmux that
257
- # refuses to split (size constraints, locked client), or a stale socket
258
- # all degrade to "log file is still on disk; the operator can tail it
259
- # manually from any terminal." The wrapper does NOT switch focus to the
258
+ # new pane carries the title `codex-<role>-<pid>-tail` so the operator can
259
+ # pair it with its caller pane (`codex-<role>-<pid>`). The split is
260
+ # explicitly anchored to the caller pane (`-t "$caller_pane"`) to avoid
261
+ # attaching to tmux's idle active pane. `role` is the optional 5th
262
+ # positional arg (defaults to `worker`); callers that dispatch a different
263
+ # role (e.g. `executor`) must pass it explicitly. The `<pid>` suffix is the
264
+ # wrapper's PID and disambiguates concurrent dispatches of the same role.
265
+ # The pane uses `tail -F` (follow-by-name) so it survives any truncation a
266
+ # re-dispatch performs on the same log path. We gate on a resolved
267
+ # `$caller_pane` (non-empty only when tmux is reachable) rather than the
268
+ # now-stripped `$TMUX`. Failures are tolerated silently: no tmux, a tmux
269
+ # that refuses to split (size constraints, locked client), or a stale
270
+ # socket all degrade to "log file is still on disk; the operator can tail
271
+ # it manually from any terminal." The wrapper does NOT switch focus to the
260
272
  # new pane — control returns to the caller's pane via `tmux last-pane`.
261
- if [[ -n "${TMUX:-}" ]]; then
262
- split_args=(-h -P -F '#{pane_id}' -c "$(dirname "$log_path")")
263
- if [[ -n "$caller_pane" ]]; then
264
- split_args+=(-t "$caller_pane")
265
- fi
273
+ if [[ -n "$caller_pane" ]]; then
274
+ split_args=(-h -P -F '#{pane_id}' -c "$(dirname "$log_path")" -t "$caller_pane")
266
275
  trace_pane=$(tmux split-window "${split_args[@]}" \
267
276
  "tail -F $(printf '%q' "$log_path")" 2>/dev/null || true)
268
277
  if [[ -n "$trace_pane" ]]; then
269
278
  tmux select-pane -t "$trace_pane" -T "$trace_label" 2>/dev/null || true
279
+ # Tag the spawned pane with THIS run's dir so `okstra-trace-cleanup.sh
280
+ # --run-dir <RUN_DIR>` (see that script + `_common-contract.md`) can find
281
+ # and close exactly this run's trace panes — discovered server-wide by
282
+ # tag, needing no tmux env var, no pane-id registry, and no active-pane
283
+ # assumption. The run-scoped tag also stops concurrent okstra runs from
284
+ # stomping each other's trace panes.
285
+ [[ -n "$run_dir" ]] && tmux set-option -p -t "$trace_pane" @okstra_trace_run "$run_dir" 2>/dev/null || true
270
286
  tmux last-pane 2>/dev/null || true
271
- # Register the spawned pane so the `SessionEnd` hook (see
272
- # `okstra-trace-cleanup.sh`) can kill it when the caller's Claude
273
- # session exits. Scope by `$caller_pane` — the pane Claude itself is
274
- # attached to — so concurrent Claude instances in the same tmux
275
- # session do not stomp each other's trace panes.
276
- if [[ -n "$caller_pane" ]]; then
277
- registry_dir="${TMPDIR:-/tmp}/okstra-trace-panes"
278
- mkdir -p "$registry_dir" 2>/dev/null || true
279
- safe_pane="${caller_pane//[^A-Za-z0-9]/_}"
280
- printf '%s\n' "$trace_pane" >> "$registry_dir/${safe_pane}.list" 2>/dev/null || true
281
- fi
282
287
  fi
283
288
  fi
284
289
 
@@ -136,24 +136,32 @@ python3 "$script_dir/okstra-wrapper-status.py" \
136
136
  init "$status_path" "$(basename "$0")" "$role" "$$" "$started_ts" "$log_path" \
137
137
  >>"$log_path" 2>&1 || true
138
138
 
139
- # Resolve caller pane id robustly. See `okstra-codex-exec.sh` for the full
140
- # rationale — kept in lock-step: tmux normally exports both `$TMUX` and
141
- # `$TMUX_PANE`, but Claude Code's Bash tool can drop `$TMUX_PANE` while
142
- # preserving `$TMUX`, which silently skips the caller-pane rename and
143
- # lets `tmux split-window` attach to whatever tmux considers active.
139
+ # Resolve the run dir and the trace-split anchor pane. See
140
+ # `okstra-codex-exec.sh` for the full rationale — kept in lock-step: derive
141
+ # `<RUN_DIR>` from the prompt path (paths.py SSOT) to read the lead-recorded
142
+ # pane and to tag the trace pane; prefer that lead pane over the unreliable
143
+ # active-pane probe (this wrapper runs backgrounded and `$TMUX`/`$TMUX_PANE`
144
+ # are stripped).
145
+ run_dir="$(cd "$(dirname "$prompt_path")/.." 2>/dev/null && pwd -P || true)"
146
+ lead_pane_file="${run_dir:+$run_dir/state/lead-pane.id}"
147
+
144
148
  caller_pane="${TMUX_PANE:-}"
145
- if [[ -z "$caller_pane" && -n "${TMUX:-}" ]]; then
149
+ if [[ -z "$caller_pane" && -n "$lead_pane_file" && -r "$lead_pane_file" ]]; then
150
+ cand="$(head -n1 "$lead_pane_file" 2>/dev/null || true)"
151
+ if [[ -n "$cand" ]] && tmux display-message -p -t "$cand" '#{pane_id}' >/dev/null 2>&1; then
152
+ caller_pane="$cand"
153
+ fi
154
+ fi
155
+ if [[ -z "$caller_pane" ]]; then
146
156
  caller_pane=$(tmux display-message -p '#{pane_id}' 2>/dev/null || true)
147
157
  fi
148
158
 
149
159
  # Pane titles: worker (caller) pane gets `gemini-<role>-<pid>`; the sibling
150
- # trace pane appends `-trace[from=<caller-pane-id>]`. The wrapper PID
151
- # disambiguates concurrent dispatches of the same role; the embedded
152
- # caller pane id keeps the trace ↔ worker mapping visible even if the
153
- # worker pane's title is later overwritten by the parent process (e.g.
154
- # Claude Code's TUI emitting OSC 2 escape sequences on its own pane).
160
+ # trace pane is that same caller title with a `-tail` suffix, so the
161
+ # operator can visually pair `<caller> ↔ <caller>-tail`. The wrapper PID in
162
+ # the caller title disambiguates concurrent dispatches of the same role.
155
163
  pane_label="gemini-${role}-$$"
156
- trace_label="${pane_label}-trace[from=${caller_pane:-?}]"
164
+ trace_label="${pane_label}-tail"
157
165
 
158
166
  # Capture the caller pane's current title so the EXIT trap can restore it
159
167
  # once the wrapper returns. Empty when not in tmux or capture fails — the
@@ -186,33 +194,26 @@ fi
186
194
  # When a tmux session is reachable, split a sibling pane tailing the log so
187
195
  # the operator can watch progress live. This fires in every phase the
188
196
  # wrapper is invoked from — long-running gemini dispatches are not
189
- # implementation-specific. Title `gemini-<role>-<pid>-trace[from=<caller-pane>]`
190
- # so the operator can map trace ↔ worker by pane id even when the worker
191
- # pane title is later overwritten by Claude Code. The split is explicitly
192
- # anchored to the caller pane to avoid attaching to tmux's idle active
193
- # pane when `$TMUX_PANE` was missing. `role` is the optional 5th
194
- # positional arg (defaults to `worker`); callers that dispatch a
195
- # different role must pass it explicitly. The `<pid>` suffix is the
196
- # wrapper's PID and disambiguates concurrent dispatches of the same role.
197
- # See the codex wrapper for the full design rationale and the
197
+ # implementation-specific. Title `gemini-<role>-<pid>-tail` so the operator
198
+ # can pair it with its caller pane (`gemini-<role>-<pid>`). The split is
199
+ # explicitly anchored to the caller pane to avoid attaching to tmux's idle
200
+ # active pane. `role` is the optional 5th positional arg (defaults to
201
+ # `worker`); callers that dispatch a different role must pass it explicitly.
202
+ # The `<pid>` suffix is the wrapper's PID and disambiguates concurrent
203
+ # dispatches of the same role. We gate on a resolved `$caller_pane`
204
+ # (non-empty only when tmux is reachable) rather than the now-stripped
205
+ # `$TMUX`. See the codex wrapper for the full design rationale and the
198
206
  # silent-degrade failure model.
199
- if [[ -n "${TMUX:-}" ]]; then
200
- split_args=(-h -P -F '#{pane_id}' -c "$(dirname "$log_path")")
201
- if [[ -n "$caller_pane" ]]; then
202
- split_args+=(-t "$caller_pane")
203
- fi
207
+ if [[ -n "$caller_pane" ]]; then
208
+ split_args=(-h -P -F '#{pane_id}' -c "$(dirname "$log_path")" -t "$caller_pane")
204
209
  trace_pane=$(tmux split-window "${split_args[@]}" \
205
210
  "tail -F $(printf '%q' "$log_path")" 2>/dev/null || true)
206
211
  if [[ -n "$trace_pane" ]]; then
207
212
  tmux select-pane -t "$trace_pane" -T "$trace_label" 2>/dev/null || true
213
+ # Tag with this run's dir for `okstra-trace-cleanup.sh --run-dir`. See
214
+ # `okstra-codex-exec.sh` for the rationale — kept in lock-step.
215
+ [[ -n "$run_dir" ]] && tmux set-option -p -t "$trace_pane" @okstra_trace_run "$run_dir" 2>/dev/null || true
208
216
  tmux last-pane 2>/dev/null || true
209
- # See `okstra-codex-exec.sh` for the registry rationale — kept in lock-step.
210
- if [[ -n "$caller_pane" ]]; then
211
- registry_dir="${TMPDIR:-/tmp}/okstra-trace-panes"
212
- mkdir -p "$registry_dir" 2>/dev/null || true
213
- safe_pane="${caller_pane//[^A-Za-z0-9]/_}"
214
- printf '%s\n' "$trace_pane" >> "$registry_dir/${safe_pane}.list" 2>/dev/null || true
215
- fi
216
217
  fi
217
218
  fi
218
219
 
@@ -1,93 +1,136 @@
1
1
  #!/usr/bin/env bash
2
2
  #
3
- # okstra-trace-cleanup.sh — manage tmux panes created during an okstra run for
4
- # the current Claude Code (lead) session.
3
+ # okstra-trace-cleanup.sh — close tmux panes created during okstra runs.
5
4
  #
6
- # Two pane sources are cleaned for the lead's session:
7
- # 1. Trace panes — `tail -F` siblings spawned by the codex/gemini wrappers
8
- # (`okstra-codex-exec.sh`, `okstra-gemini-exec.sh`). Tracked in a registry
9
- # keyed by the lead's `$TMUX_PANE`.
10
- # 2. Worker-agent panes: panes the harness gives to dispatched okstra
11
- # subagents (`claude-worker`, `codex-worker`, `gemini-worker`,
12
- # `report-writer-worker`). Not registered anywhere by okstra; identified by
13
- # a title allowlist within the lead's tmux session.
5
+ # Trace panes are `tail -F` siblings spawned by the codex/gemini wrappers
6
+ # (`okstra-codex-exec.sh`, `okstra-gemini-exec.sh`). Each wrapper tags the pane
7
+ # it spawns with a pane-level user option `@okstra_trace_run=<RUN_DIR>`, so the
8
+ # panes are found server-wide by tag; no tmux env var and no pane-id registry
9
+ # are needed, and the run-scoped tag keeps concurrent okstra runs from closing
10
+ # each other's panes.
14
11
  #
15
- # The lead's own pane (`$TMUX_PANE`) is NEVER killed, even if its title matches
16
- # the allowlist. The scan is scoped to the lead's session (`list-panes -s`),
17
- # never the whole server (`-a`).
12
+ # Two invocation shapes:
18
13
  #
19
- # Two modes:
20
- # (default) kill every okstra pane (sources 1+2) and remove the registry
21
- # file. Used by the `SessionEnd` hook, by the lead's per-phase
22
- # auto-clean, and by the lead's end-of-run prompt (yes-branch).
23
- # --list print one line per okstra pane (`<pane_id>\t<pane_title>`) so the
24
- # lead can show the user what would be closed. Empty stdout when
25
- # nothing is tracked.
14
+ # --run-dir <RUN_DIR> Used by the LEAD between phases and at wrap-up. Closes
15
+ # (a) trace panes tagged with this run's dir and
16
+ # (b) worker-agent panes the harness gives to dispatched
17
+ # subagents (`claude-worker` / `codex-worker` /
18
+ # `gemini-worker` / `report-writer-worker`), identified
19
+ # by a title allowlist scoped to the LEAD's session. The
20
+ # lead pane is read from `<RUN_DIR>/state/lead-pane.id`
21
+ # (recorded once by the lead in its own foreground pane —
22
+ # reliable even though Claude Code's Bash tool strips
23
+ # `$TMUX`/`$TMUX_PANE`); it scopes the title scan and is
24
+ # NEVER killed.
26
25
  #
27
- # Failures are tolerated silently — a stale pane id, missing $TMUX, or a locked
28
- # tmux client must never prevent Claude from exiting cleanly.
26
+ # --reap Used by the `SessionEnd` hook, where no single run-dir
27
+ # applies. Closes every trace pane whose tag points under
28
+ # `$CLAUDE_PROJECT_DIR/.okstra/` (or every tagged trace
29
+ # pane if that env var is unset). Harness-owned
30
+ # worker-agent panes are left to the harness.
31
+ #
32
+ # `--list` (alias `--dry-run`) prints `<pane_id>\t<pane_title>` per pane instead
33
+ # of killing — only meaningful with `--run-dir`.
34
+ #
35
+ # Failures are tolerated silently — a stale pane id, no tmux, or a locked tmux
36
+ # client must never prevent Claude from exiting cleanly.
29
37
 
30
38
  set -u
31
39
 
32
- MODE="kill"
33
- case "${1:-}" in
34
- "") MODE="kill" ;;
35
- --list) MODE="list" ;;
36
- --dry-run) MODE="list" ;; # alias
37
- -h|--help)
38
- cat <<'USAGE'
39
- usage: okstra-trace-cleanup.sh [--list]
40
+ MODE="kill" # kill | list
41
+ REAP=0
42
+ run_dir=""
43
+ while [[ $# -gt 0 ]]; do
44
+ case "$1" in
45
+ --list|--dry-run) MODE="list" ;;
46
+ --reap) REAP=1 ;;
47
+ --run-dir) shift; run_dir="${1-}" ;;
48
+ --run-dir=*) run_dir="${1#--run-dir=}" ;;
49
+ -h|--help)
50
+ cat <<'USAGE'
51
+ usage: okstra-trace-cleanup.sh (--run-dir <RUN_DIR> [--list] | --reap)
40
52
 
41
- (no args) kill every okstra pane for $TMUX_PANE (trace + worker-agent);
42
- remove the trace registry file.
43
- --list print "<pane_id>\t<pane_title>" per okstra pane; no kill.
53
+ --run-dir okstra run directory; closes that run's trace + worker-agent panes.
54
+ --list with --run-dir: print "<pane_id>\t<pane_title>" per pane; no kill.
44
55
  --dry-run alias for --list.
56
+ --reap close every okstra trace pane under $CLAUDE_PROJECT_DIR/.okstra
57
+ (SessionEnd hook; no single run-dir applies).
45
58
  USAGE
46
- exit 0 ;;
47
- *)
48
- printf 'okstra-trace-cleanup.sh: unknown option: %s\n' "$1" >&2
49
- exit 2 ;;
50
- esac
59
+ exit 0 ;;
60
+ *)
61
+ printf 'okstra-trace-cleanup.sh: unknown option: %s\n' "$1" >&2
62
+ exit 2 ;;
63
+ esac
64
+ shift
65
+ done
51
66
 
52
- # No tmux pane context: nothing to clean / list.
53
- if [[ -z "${TMUX_PANE:-}" ]]; then
54
- exit 0
67
+ if [[ "$REAP" -eq 0 && -z "$run_dir" ]]; then
68
+ printf 'okstra-trace-cleanup.sh: --run-dir <RUN_DIR> (or --reap) is required\n' >&2
69
+ exit 2
55
70
  fi
56
71
 
57
- registry_dir="${TMPDIR:-/tmp}/okstra-trace-panes"
58
- safe_pane="${TMUX_PANE//[^A-Za-z0-9]/_}"
59
- registry_file="$registry_dir/${safe_pane}.list"
72
+ # Canonicalize paths used in tag string-compares. The wrappers tag panes with
73
+ # `pwd -P` (symlink-resolved), so the scope paths must be resolved the same way
74
+ # — else a symlinked component (e.g. macOS /tmp -> /private/tmp) makes the
75
+ # compare miss. Fall back to the literal value if the dir does not resolve.
76
+ _resolve() { (cd "$1" 2>/dev/null && pwd -P) || printf '%s' "$1"; }
77
+ [[ -n "$run_dir" ]] && run_dir="$(_resolve "$run_dir")"
78
+ project_dir=""
79
+ [[ -n "${CLAUDE_PROJECT_DIR:-}" ]] && project_dir="$(_resolve "$CLAUDE_PROJECT_DIR")"
80
+
81
+ # Lead pane. For a run, prefer the value the lead recorded in its own foreground
82
+ # pane; fall back to the active-pane probe. Rejected if the recorded pane is
83
+ # gone. For --reap there is no run state — probe the active pane, used only to
84
+ # avoid killing whatever pane the reap runs from.
85
+ lead_pane=""
86
+ if [[ "$REAP" -eq 0 ]]; then
87
+ lead_pane_file="$run_dir/state/lead-pane.id"
88
+ [[ -r "$lead_pane_file" ]] && lead_pane="$(head -n1 "$lead_pane_file" 2>/dev/null || true)"
89
+ fi
90
+ if [[ -z "$lead_pane" ]] || ! tmux display-message -p -t "$lead_pane" '#{pane_id}' >/dev/null 2>&1; then
91
+ lead_pane="$(tmux display-message -p '#{pane_id}' 2>/dev/null || true)"
92
+ fi
93
+
94
+ # Does a trace pane's tag belong to the set we are closing?
95
+ _tag_in_scope() {
96
+ local tag="$1"
97
+ if [[ "$REAP" -eq 1 ]]; then
98
+ [[ -z "$tag" ]] && return 1
99
+ [[ -n "$project_dir" ]] && { [[ "$tag" == "$project_dir/"* ]]; return; }
100
+ return 0 # no project scope available → reap every tagged trace pane
101
+ fi
102
+ [[ "$tag" == "$run_dir" ]]
103
+ }
60
104
 
61
- # Collect okstra pane ids for the lead session: registered trace panes ∪
62
- # title-allowlisted worker-agent panes, always excluding the lead pane itself.
63
105
  collect_okstra_panes() {
64
106
  local -a panes=()
65
- local pid title
107
+ local pid tag title
108
+
109
+ # (1) Trace panes tagged in scope — found server-wide by tag, so no tmux env
110
+ # var or pane-id registry is needed.
111
+ while IFS=$'\t' read -r pid tag; do
112
+ [[ -n "$pid" ]] || continue
113
+ [[ "$pid" == "$lead_pane" ]] && continue
114
+ _tag_in_scope "$tag" && panes+=("$pid")
115
+ done < <(tmux list-panes -a -F '#{pane_id}'$'\t''#{@okstra_trace_run}' 2>/dev/null || true)
66
116
 
67
- # (1) Registered trace panes, scoped to THIS lead's registry only, so
68
- # concurrent Claude instances do not stomp each other's trace panes.
69
- if [[ -f "$registry_file" ]]; then
70
- while IFS= read -r pid; do
117
+ # (2) Title-allowlisted worker-agent panes in the lead's session. Only for a
118
+ # run (reap leaves these harness-owned panes to the harness). `list-panes -s
119
+ # -t <pane>` resolves the session containing that pane, so the scan never
120
+ # reaches other sessions (no `-a`). Skipped when the lead pane is unknown.
121
+ if [[ "$REAP" -eq 0 && -n "$lead_pane" ]]; then
122
+ while IFS=$'\t' read -r pid title; do
71
123
  [[ -n "$pid" ]] || continue
72
- [[ "$pid" == "$TMUX_PANE" ]] && continue
73
- panes+=("$pid")
74
- done < "$registry_file"
124
+ [[ "$pid" == "$lead_pane" ]] && continue
125
+ case "$title" in
126
+ *claude-worker*|*codex-worker*|*gemini-worker*|*report-writer-worker*)
127
+ panes+=("$pid") ;;
128
+ esac
129
+ done < <(tmux list-panes -s -t "$lead_pane" \
130
+ -F '#{pane_id}'$'\t''#{pane_title}' 2>/dev/null || true)
75
131
  fi
76
132
 
77
- # (2) Title-allowlisted worker-agent panes in the lead's session.
78
- # `list-panes -s -t <pane>` resolves the session containing that pane, so the
79
- # scan never reaches other sessions (no `-a`).
80
- while IFS=$'\t' read -r pid title; do
81
- [[ -n "$pid" ]] || continue
82
- [[ "$pid" == "$TMUX_PANE" ]] && continue
83
- case "$title" in
84
- *claude-worker*|*codex-worker*|*gemini-worker*|*report-writer-worker*)
85
- panes+=("$pid") ;;
86
- esac
87
- done < <(tmux list-panes -s -t "$TMUX_PANE" \
88
- -F '#{pane_id}'$'\t''#{pane_title}' 2>/dev/null || true)
89
-
90
- # Dedupe — a live trace pane can match both the registry and the title scan.
133
+ # Dedupe: a live trace pane can match both the tag scan and the title scan.
91
134
  if (( ${#panes[@]} )); then
92
135
  printf '%s\n' "${panes[@]}" | awk 'NF && !seen[$0]++'
93
136
  fi
@@ -109,5 +152,4 @@ while IFS= read -r pane_id; do
109
152
  tmux kill-pane -t "$pane_id" 2>/dev/null || true
110
153
  done < <(collect_okstra_panes)
111
154
 
112
- rm -f "$registry_file" 2>/dev/null || true
113
155
  exit 0
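The scoping decision made by `_tag_in_scope` in the new script can be restated as a small pure function. This is only an illustrative sketch in Python, not part of the package: the function name and keyword arguments are hypothetical, and it mirrors the shell code as written (in reap mode the compare is a prefix match on the resolved project dir followed by `/`).

```python
def tag_in_scope(tag: str, *, reap: bool, run_dir: str = "", project_dir: str = "") -> bool:
    """Illustrative mirror of the script's _tag_in_scope (hypothetical helper)."""
    if reap:
        if not tag:
            return False  # untagged panes are never reaped
        if project_dir:
            # Only tags that point under the resolved project dir are in scope.
            return tag.startswith(project_dir + "/")
        return True  # no project scope available: every tagged trace pane
    # Run mode: the tag must equal this run's canonical (symlink-resolved) dir.
    return tag == run_dir


if __name__ == "__main__":
    print(tag_in_scope("/p/.okstra/runs/a", reap=False, run_dir="/p/.okstra/runs/a"))
    print(tag_in_scope("/q/.okstra/runs/b", reap=True, project_dir="/p"))
```

Because run mode is an exact string compare, both sides have to be canonicalized the same way, which is why the script resolves `run_dir` with `pwd -P` before comparing.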
@@ -29,22 +29,33 @@ profile document.
29
29
  - This rule does NOT relax any phase-specific Forbidden actions list; safety rules in the per-profile document remain in force regardless of the user's authority.
30
30
  - Anti-escalation rule (shared):
31
31
  - treating "다음 단계 진행해" or equivalent user phrases as authorisation to start a *different* lifecycle phase is forbidden. The next phase begins only in a separate okstra run launched with the new `--task-type`. Per-profile documents may further restrict this within their own scope.
32
+ - Run-start pane recording (shared — runs ONCE at run start, before the FIRST worker dispatch):
33
+ - The wrappers anchor each trace pane to the lead's pane and the cleanup scopes the worker-agent scan to it, but Claude Code's Bash tool strips `$TMUX`/`$TMUX_PANE`, so the lead MUST record its own pane explicitly. Because the lead runs this in its OWN foreground pane, the active pane IS the lead's — reliable, unlike a backgrounded wrapper's later probe.
34
+ - The lead MUST run once, at run start: `mkdir -p "<RUN_DIR>/state" && tmux display-message -p '#{pane_id}' > "<RUN_DIR>/state/lead-pane.id" 2>/dev/null || true` (substitute the run's absolute `RUN_DIR`). Outside tmux this writes nothing and every pane step below silently no-ops — that empty/absent file is the single signal that the lead is not in tmux.
32
35
  - Phase-start pane reset (shared — runs BEFORE dispatching each new worker batch):
33
- - okstra creates two kinds of tmux pane per run: (a) **worker-agent panes** the harness gives to dispatched subagents (titled `claude-worker` / `codex-worker` / `gemini-worker` / `report-writer-worker`), and (b) **trace panes** the codex/gemini wrappers spawn (`<cli>-<role>-<pid>-trace`). Both accumulate across internal phases because each new phase dispatches a fresh worker batch and the prior panes are never reclaimed.
34
- - When `$TMUX_PANE` is set, the lead MUST run `$HOME/.okstra/bin/okstra-trace-cleanup.sh` (no args) **immediately before** dispatching the next phase's workers — i.e. just before emitting each `PROGRESS: phase-5.5-convergence round=<N>` marker and just before `PROGRESS: phase-6-synthesis dispatching report-writer-worker`. This closes every prior-phase okstra pane (worker-agent + trace) for the lead session, while NEVER killing the lead's own pane.
35
- - This is **automatic and silent** — NO user prompt. Report it in one short line (e.g. `이전 phase okstra pane 3개 정리`) and proceed. It is silent-skipped when `$TMUX_PANE` is unset; the lead MUST NOT fabricate a synthetic pane list in that case.
36
+ - okstra creates two kinds of tmux pane per run: (a) **worker-agent panes** the harness gives to dispatched subagents (titled `claude-worker` / `codex-worker` / `gemini-worker` / `report-writer-worker`), and (b) **trace panes** the codex/gemini wrappers spawn (`<cli>-<role>-<pid>-tail`). Both accumulate across internal phases because each new phase dispatches a fresh worker batch and the prior panes are never reclaimed.
37
+ - When `<RUN_DIR>/state/lead-pane.id` is non-empty (the lead is in tmux), the lead MUST run `$HOME/.okstra/bin/okstra-trace-cleanup.sh --run-dir "<RUN_DIR>"` **immediately before** dispatching the next phase's workers — i.e. just before emitting each `PROGRESS: phase-5.5-convergence round=<N>` marker and just before `PROGRESS: phase-6-synthesis dispatching report-writer-worker`. This closes every prior-phase okstra pane (worker-agent + trace) for this run, while NEVER killing the lead's own pane.
38
+ - This is **automatic and silent** — NO user prompt. Report it in one short line (e.g. `이전 phase okstra pane 3개 정리`) and proceed. It is silent-skipped when the lead is not in tmux; the lead MUST NOT fabricate a synthetic pane list in that case.
39
+ - Run-end team teardown (shared — runs AFTER Phase 7 persistence/token collection, BEFORE the pane disposition step below):
40
+ - The lead created the worker team in Phase 3 (`TeamCreate(team_name: "okstra-<task-key>")`). Worker teammates are NOT reclaimed on their own — without an explicit teardown they linger in the FleetView roster across this and later runs in the session. The lead MUST release them once the run's work is done.
41
+ - This step is **automatic and silent** — NO user prompt (workers are idle sessions that have already delivered their results; there is nothing for the user to preserve). It runs only when team-state's `teamCreate.status == "ok"` (Teams mode was actually used); in the no-`team_name` fallback there is no team to delete, so silent-skip.
42
+ - Sequence (token-usage collection MUST already be complete: `TeamDelete` removes `~/.claude/teams/<team>/` + `~/.claude/tasks/<team>/` and leaves the `~/.claude/projects/` jsonls that Phase 7 reads in place, but the read MUST still precede teardown):
43
+ 1. Read `~/.claude/teams/okstra-<task-key>/config.json` and, for every `members` entry whose name is not the lead, `SendMessage(to: <name>, message: { type: "shutdown_request" })` to terminate it gracefully.
44
+ 2. Wait for the shutdown confirmations / idle notifications from all addressed teammates.
45
+ 3. Call `TeamDelete()`. If it errors with an active-members message, a teammate has not finished shutting down — wait briefly and retry `TeamDelete()` once.
46
+ - Report it in one short line (e.g. `worker 6명 종료 + 팀 해제`) and proceed. Emit `PROGRESS: phase-7-teardown disbanding team` immediately before step 1.
36
47
  - Phase wrap-up — okstra pane disposition (shared, MUST be the *last* step before returning control to the user):
37
- - At run end the only residual okstra panes are the LAST phase's (e.g. the `report-writer-worker` agent pane and any codex/gemini trace pane). `okstra-trace-cleanup.sh --list` returns one tab-separated `<pane_id>\t<pane_title>` line per residual okstra pane (worker-agent + trace) for this lead session.
38
- - When `$TMUX_PANE` is set, after the final-report file has been written and the routing recommendation has been issued, the lead MUST run `$HOME/.okstra/bin/okstra-trace-cleanup.sh --list` exactly once. The output lists every residual okstra pane (worker-agent + trace) for this Claude session, never the lead's own pane.
48
+ - At run end the only residual okstra panes are the LAST phase's (e.g. the `report-writer-worker` agent pane and any codex/gemini trace pane). `okstra-trace-cleanup.sh --list --run-dir "<RUN_DIR>"` returns one tab-separated `<pane_id>\t<pane_title>` line per residual okstra pane (worker-agent + trace) for this run.
49
+ - When `<RUN_DIR>/state/lead-pane.id` is non-empty, after the final-report file has been written and the routing recommendation has been issued, the lead MUST run `$HOME/.okstra/bin/okstra-trace-cleanup.sh --list --run-dir "<RUN_DIR>"` exactly once. The output lists every residual okstra pane (worker-agent + trace) for this run, never the lead's own pane.
39
50
  - If the list is empty, skip the question — there is nothing to ask about (the phase-start resets above usually already cleared prior phases).
40
51
  - Otherwise the lead MUST present the user with a strict binary choice **before** declaring the phase complete. Use one prompt of this shape (Korean preferred, English acceptable if the rest of the run is in English):
41
52
  > 현재 phase 종료 시점입니다. 다음 okstra pane 이 열려 있습니다 — 닫을까요?
42
53
  > <인용된 `--list` 출력>
43
54
  > (예) 모두 닫기 / (아니오) 그대로 두기
44
- - On `예` / `y` / `close` → run `$HOME/.okstra/bin/okstra-trace-cleanup.sh` (no args) and report the kill count back in one sentence.
45
- - On `아니오` / `n` / `keep` → leave the panes intact; remind the user that they will be cleaned up automatically when Claude `/exit` fires the `SessionEnd` hook.
55
+ - On `예` / `y` / `close` → run `$HOME/.okstra/bin/okstra-trace-cleanup.sh --run-dir "<RUN_DIR>"` and report the kill count back in one sentence.
56
+ - On `아니오` / `n` / `keep` → leave the panes intact; remind the user that they will be cleaned up automatically when Claude `/exit` fires the `SessionEnd` hook (`--reap`).
46
57
  - The question MUST be a clean yes/no — do NOT offer "close some / keep some" partial answers, do NOT propose alternatives like "close only codex panes". The whole-set decision keeps the wrap-up predictable.
47
- - This step is mandatory for every phase (`requirements-discovery`, `error-analysis`, `implementation-planning`, `implementation`, `final-verification`, `release-handoff`). It is silent-skipped when `$TMUX_PANE` is unset (lead running outside tmux); the lead MUST NOT fabricate a synthetic pane list in that case.
58
+ - This step is mandatory for every phase (`requirements-discovery`, `error-analysis`, `implementation-planning`, `implementation`, `final-verification`, `release-handoff`). It is silent-skipped when `<RUN_DIR>/state/lead-pane.id` is empty/absent (lead running outside tmux); the lead MUST NOT fabricate a synthetic pane list in that case.
48
59
  - Brief handoff contract (shared — applies whenever the run consumes a task brief produced by `okstra-brief`):
49
60
  - the brief is a **pre-discovery artifact**: it converts a domain-reporter's words (non-expert *or* developer) into expert-consumable form so this and later phases can run with zero fill-in questions to the operator. The brief is **not** authoritative on solution decisions; it is authoritative on the reporter's intent.
50
61
  - **Reporter confirmation precondition (BLOCKING)**: the brief's frontmatter carries `reporter-confirmations: <complete | partial | pending | skipped>` set by `okstra-brief` Step 6.5. Every phase that consumes the brief MUST read this field before doing analysis. The handling matrix is:
@@ -26,6 +26,7 @@ until Phase 5 ends, then drop from active context for Phase 6/7.
26
26
  - Order of operations per plan step: (1) write/extend the test that captures the step's acceptance criterion and confirm it fails for the right reason, (2) commit the failing test (`test(<scope>): ...`), (3) implement the minimum change to make it pass, (4) commit the implementation (`feat|fix(<scope>): ...`), (5) refactor without changing behaviour and commit separately if any cleanup is made (`refactor(<scope>): ...`). The failing-then-passing transition between steps (2) and (4) is the `TDD evidence` required by the final report.
27
27
  - Doc-only / config-only / pure-rename steps that have no observable runtime behaviour are exempt from the failing-test requirement, but the executor MUST cite the exemption per step in the final report (`TDD exemption: <reason>`).
28
28
  - When the touched area has no existing test harness, the executor MUST stand up the minimum harness needed to host one regression test for this run rather than skipping TDD entirely. Record the harness-bootstrap step as an `Out-of-plan edit` if it is not in the plan.
29
+ - **DB / IO / SQL changes require real execution — mock-only is NOT validation evidence:** when this run's diff touches DB/IO/SQL (ORM / query-builder code — sequelize / typeorm / prisma / knex / raw SQL — `*.repository.*`, model/entity files, `migrations/**`, `*.sql`, or any changed query string), a mocked unit test cannot observe the SQL the query builder actually emits — a mocked suite once passed while `count({ col: 'FontFamily.fontFamily' })` threw `Unknown column` on the real DB. The executor MUST run the change against a real (or faithful-replica) datastore — the `db-test` validation step (plan `validation` db step, else `project.json.qaCommands.db-test`), targeting a **local / replica** DB — and cite its exact command + exit code in the final report's `Validation evidence`. If no real DB / `db-test` command is reachable, do NOT claim the change verified: label the DB portion `정적 분석상 …, 미검증(실행 안 함)` in the report, surface it in the routing recommendation, and never downplay the real run as "too heavy". `git push` stays forbidden (universal list); the unverified DB state is carried forward so `final-verification` cannot accept it and `release-handoff` cannot push.
29
30
  - re-read the approved plan end-to-end and parse the `## 4.5 Stage Map`. Determine **start stage**:
30
31
  - if `--stage <N>` is supplied, use N. Otherwise auto = the lowest stage number whose `depends-on` are all recorded as `status:done` in `runs/<plan-key>/consumers.jsonl` AND that itself has no `status:done` row. Multiple stages may match — two parallel `implementation` runs may pick different ones and proceed concurrently.
31
32
  - load every `runs/<plan-key>/carry/stage-<i>.json` for `i ∈ depends-on(start_stage)` and inject them into the executor's working context as "runtime carry-in". For `depends-on (none)` stages, no sidecar load — task-brief only.
@@ -30,7 +30,8 @@ Verifier obtains the QA command set from exactly two declared sources, in order
30
30
  "lint": [{ "label": "cargo clippy", "cmd": "cargo clippy --all-targets -- -D warnings", "language": "rust" }],
31
31
  "format": [{ "label": "cargo fmt", "cmd": "cargo fmt --check", "language": "rust" }],
32
32
  "typecheck": [{ "label": "tsc", "cmd": "pnpm exec tsc --noEmit", "language": "ts" }],
33
- "test": [{ "label": "cargo test", "cmd": "cargo test --workspace --locked", "language": "rust" }]
33
+ "test": [{ "label": "cargo test", "cmd": "cargo test --workspace --locked", "language": "rust" }],
34
+ "db-test": [{ "label": "db integ", "cmd": "pnpm test:db", "language": "ts" }]
34
35
  }
35
36
  }
36
37
  ```
@@ -42,7 +43,7 @@ Tier 1 commands run verbatim first. Then every Tier 2 entry runs once. Each comm
42
43
 
43
44
  ### Missing-tier handling
44
45
 
45
- If a tier is empty or absent, verifier records the single line `qa-command not configured: <category>` per missing category (`lint` / `format` / `typecheck` / `test`) in the worker result and proceeds — silent omission is a contract violation. Verifier MUST NOT auto-detect or invent a command in this case; the user/operator must declare it in `project.json.qaCommands` or in the plan.
46
+ If a tier is empty or absent, verifier records the single line `qa-command not configured: <category>` per missing category (`lint` / `format` / `typecheck` / `test`; and `db-test` **only when the diff touches DB/IO/SQL**, where a missing `db-test` is escalated to a blocking finding per the DB real-execution gate below — not a passive note) in the worker result and proceeds — silent omission is a contract violation. Verifier MUST NOT auto-detect or invent a command in this case; the user/operator must declare it in `project.json.qaCommands` or in the plan.
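The missing-category rule above can be sketched as follows. This is a simplified illustration, assuming the two tiers have already been merged into one set of categories that have at least one command; the function name and its return shape are hypothetical and do not come from the package.

```python
BASE_CATEGORIES = ("lint", "format", "typecheck", "test")


def missing_category_notes(configured: set[str], diff_touches_db: bool) -> tuple[list[str], bool]:
    """Return (notes, db_blocking) for categories with no declared command.

    `configured` is an assumed, pre-merged set of category names.
    """
    notes = [
        f"qa-command not configured: {category}"
        for category in BASE_CATEGORIES
        if category not in configured
    ]
    # db-test is only considered when the diff touches DB/IO/SQL, and then a
    # missing command is a blocking finding rather than a passive note.
    db_blocking = diff_touches_db and "db-test" not in configured
    if db_blocking:
        notes.append("qa-command not configured: db-test")
    return notes, db_blocking


if __name__ == "__main__":
    print(missing_category_notes({"lint", "test"}, diff_touches_db=True))
```

The sketch never invents a command for a missing category; it only reports the gap, matching the rule that the verifier must not auto-detect one.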
46
47
 
47
48
  ### `cmd` field deny-list (Tier 2 validation)
48
49
 
@@ -74,6 +75,16 @@ Re-running commands proves the diff *builds and passes*; it does NOT prove the d
74
75
  - **Advisory findings (recorded as recommendations; verdict MAY still PASS):** function >50 effective lines, a single body mixing read+write stages, weak readability, a missing-but-non-critical outcome assertion. These land in the verifier result as `should-fix` / `nit` recommendations, not as a `FAIL`.
75
76
  - **Output.** Every finding — blocking or advisory — is a structured item in the verifier's worker result (`path:line`, rule, severity, suggested fix) so it carries into Phase 5.5 convergence and the final report. A blocking hit sets the verifier verdict to `FAIL` with the rule cited, using the same verdict machinery as the Discrepancy rule above. `Claude lead` MUST NOT silently downgrade a cited blocking finding to advisory during synthesis; an override requires a concrete cited reason, exactly as for the Discrepancy rule.
76
77
 
78
+ ### DB / IO / SQL change — real-execution gate (mock-only acceptance forbidden)
79
+
80
+ A mocked unit test cannot observe the SQL a query builder actually emits — `count({ col: 'FontFamily.fontFamily' })` passes a mocked suite yet throws `Unknown column` on a real database. For this class of change a green mock-only suite is therefore NOT evidence; only a run against a real (or faithful-replica) datastore is. This gate is the verifier's enforcement of that rule.
81
+
82
+ - **Trigger.** Fires when `git diff <base>...HEAD` touches DB/IO/SQL: ORM / query-builder code (sequelize / typeorm / prisma / knex / raw SQL), `*.repository.*`, model/entity files, `migrations/**`, `*.sql`, or any changed query string.
83
+ - **Requirement when fired.** The verifier MUST reproduce a real-DB execution: run the `db-test` tier (Tier 1 = plan `validation` db step; else Tier 2 = `project.json.qaCommands.db-test`) against a **local / replica** datastore (same engine + schema — never shared / staging / prod, consistent with the verifier forbidden-actions list) and record its exact command + exit code. A mock, an in-memory shim that does not parse real SQL, or static reasoning does NOT satisfy this.
84
+ - **No `db-test` command available → blocking, not a passive skip.** If neither tier declares a `db-test` command, the verifier records the blocking finding `db-test not configured — DB change unverified (mock-only)` and sets the verdict to `FAIL`; it MUST NOT emit only the passive `qa-command not configured` note and pass. Recommended fix: declare a `db-test` command in `project.json.qaCommands` or the plan's validation set.
85
+ - **Mock-only evidence → unverified.** If the diff's only DB coverage is mocked, the verifier labels the DB portion `정적 분석상 …, 미검증(실행 안 함)` (never `검증됨`), records it as a blocking finding, and sets `FAIL`. Never downplay the real run as "too heavy / static proof suffices".
86
+ - **Surface it at every layer.** The finding is copied verbatim into the verifier result and MUST survive into the final report's `## 1.` and Verdict Card, so the user sees the DB-unverified state continuously — it is the load-bearing reason a downstream `final-verification` cannot reach `accepted` and `release-handoff` cannot push.
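The path-based part of the trigger above could be approximated as shown here. This is a minimal sketch under stated assumptions: it covers only `*.repository.*`, `migrations/**`, and `*.sql`, and the helper name is hypothetical. ORM or query-builder code, model/entity files, and changed query strings need content inspection, which this sketch does not attempt.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# File-name patterns taken from the trigger list; directory rule handled separately.
DB_NAME_PATTERNS = ("*.repository.*", "*.sql")


def touches_db(changed_paths) -> bool:
    """Return True if any changed path matches the path-based DB patterns."""
    for raw in changed_paths:
        path = PurePosixPath(raw)
        # `migrations/**`: any file below a directory named `migrations`.
        if "migrations" in path.parts[:-1]:
            return True
        if any(fnmatchcase(path.name, pattern) for pattern in DB_NAME_PATTERNS):
            return True
    return False


if __name__ == "__main__":
    print(touches_db(["src/user.repository.ts", "README.md"]))
```

The input would typically be the list of paths reported for `git diff <base>...HEAD`; how the verifier actually obtains or classifies that list is not shown in this diff.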
87
+
77
88
  ## All-verifier-failure policy
78
89
 
79
90
  If every verifier present in the resolved roster (`Claude verifier`, `Codex verifier`, and `Gemini verifier` when opted in) ends with a non-result terminal status (`timeout`, `error`, `not-run`) — i.e. zero independent verdicts were produced — the run MUST end with status `blocked` and route to a follow-up `error-analysis` run. `Claude lead` MUST NOT substitute its own verdict in place of the missing verifier outputs; synthesis requires at least one independent verifier's verdict. If one or more verifiers fail but at least one returns a verdict, the run proceeds with the surviving verdict(s) and the final report MUST explicitly notate which verifiers were unavailable, with the captured error / timeout evidence per failed verifier.
@@ -14,6 +14,7 @@
14
14
  - delivered artifacts match recorded expected values in `reference-expectations` (config files, deployment manifests, other recorded expected states); when reference-expectations are absent, record it as missing information rather than assuming a match
15
15
  - test & validation suite pass status — independently re-run the read-only two-tier command set (Tier 1 = brief/approved-plan `validation`, Tier 2 = `project.json` `qaCommands`) and confirm each passes on the verified head, citing exact command + exit code
16
16
  - test correctness — delivered tests actually assert the intended behaviour: no gutted/weakened assertions, no tautological or always-passing tests, no tests exercising only mocks; new behaviour has matching coverage
17
+ - DB / IO / SQL real-execution evidence — when the diff touches DB/IO/SQL (ORM / query-builder, `*.repository.*`, model / `migrations/**` / `*.sql`, or changed query strings), Validation Evidence MUST cite a real (or faithful-replica) DB execution — the `db-test` command + exit code — not a mock-only suite, because a mocked suite cannot observe the SQL actually emitted (`count({ col: 'FontFamily.fontFamily' })` passed mocks yet threw `Unknown column` on the real DB). A DB-touching change whose only evidence is mocked, or for which no `db-test` ran, is an **Acceptance Blocker** (`major`+): record it, and since `accepted` requires zero blockers the verdict becomes `conditional-accept` / `blocked`. This is the gate that stops an unverified DB change from reaching `release-handoff` and being pushed.
17
18
  - no new defects introduced — the diff does not break previously-working behaviour and adds no new bug (logic/off-by-one, null/empty handling, resource leaks, broken error paths)
18
19
  - scope conformance — the delivered diff stays within the approved plan's scope; flag out-of-scope edits, unrelated file changes, leftover debug/commented-out code, and unintended deletions
19
20
  - Residual-tracked — note as Residual Risk unless severe enough to block:
@@ -20,7 +20,11 @@ import re
20
20
  from typing import Iterable
21
21
 
22
22
  # Category whitelist. Unknown categories are rejected because they are most likely typos.
23
- ALLOWED_CATEGORIES: tuple[str, ...] = ("lint", "format", "typecheck", "test")
23
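+ # `db-test` is a category reserved for tests that run DB/IO/SQL changes against a real (or faithful-replica) database.
24
+ # A mocked unit test cannot observe the SQL the query builder actually emits, so it is
25
+ # kept separate from `test`. The DB real-execution gate of the implementation verifier and final-verification
26
+ # requires this category (or the db step of the plan's validation) whenever the diff touches the DB.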
27
+ ALLOWED_CATEGORIES: tuple[str, ...] = ("lint", "format", "typecheck", "test", "db-test")
24
28
 
25
29
  # Tokens that trigger a mutation or update a lockfile. Each token is detected either in the
26
30
  # whitespace-split pieces of the `cmd` string or by a partial-match pattern (prefix/suffix sensitive).
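To illustrate what the extended whitelist means for a `qaCommands` block, here is a small sketch. The helper `unknown_categories` is hypothetical; the module's real validation function is not visible in this hunk.

```python
ALLOWED_CATEGORIES: tuple[str, ...] = ("lint", "format", "typecheck", "test", "db-test")


def unknown_categories(qa_commands: dict) -> list[str]:
    """Return the category keys that are not on the whitelist, sorted."""
    return sorted(key for key in qa_commands if key not in ALLOWED_CATEGORIES)


if __name__ == "__main__":
    # `dbtest` (missing hyphen) is the kind of typo the whitelist is meant to catch.
    print(unknown_categories({"test": [], "dbtest": []}))
```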
@@ -127,6 +127,8 @@ The four steps below MUST execute in this exact order. Reordering them is the re
127
127
 
128
128
  The status file is written after step 3 completes.
129
129
 
130
+ **Run-end team teardown follows this whole sequence.** Token-usage collection (step 1) reads the worker session jsonls, so the lead MUST NOT disband the team until every step above is done. Only then does the lead shut down worker teammates + `TeamDelete` per `_common-contract.md` "Run-end team teardown" (Teams mode only; silent-skip in the no-`team_name` fallback).
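The ordering described here (finish token collection, ask workers to shut down, wait, then delete the team with a single retry) can be sketched with injected callables. Everything in this block is hypothetical scaffolding: `TeamDelete` and `SendMessage` are tools the lead invokes, not Python functions, so they are modelled as plain callbacks, and the sketch assumes it is only called after the Phase 7 steps have completed.

```python
from typing import Callable, Iterable


def run_end_teardown(
    teams_mode: bool,
    member_names: Iterable[str],
    lead_name: str,
    send_shutdown: Callable[[str], None],
    wait_for_idle: Callable[[list], None],
    team_delete: Callable[[], bool],
) -> str:
    """Illustrative teardown order; returns 'skipped', 'deleted', or 'failed'."""
    if not teams_mode:
        return "skipped"  # no-team_name fallback: there is no team to delete
    workers = [name for name in member_names if name != lead_name]
    for name in workers:
        send_shutdown(name)  # stands in for the shutdown_request message
    wait_for_idle(workers)
    if team_delete():
        return "deleted"
    # Active members remained: wait briefly, then retry exactly once.
    wait_for_idle(workers)
    return "deleted" if team_delete() else "failed"


if __name__ == "__main__":
    status = run_end_teardown(
        True, ["lead", "w1"], "lead",
        send_shutdown=lambda name: None,
        wait_for_idle=lambda names: None,
        team_delete=lambda: True,
    )
    print(status)
```

Passing the side effects in as callbacks keeps the ordering testable without a live team.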
131
+
130
132
  ## Final Report Structure
131
133
 
132
134
  The final report follows the structure encoded in `schemas/final-report-v1.0.schema.json`. The schema is the single source of truth for section names, row shapes, enum values, and task-type-conditional blocks. The Jinja2 template `templates/reports/final-report.template.md` produces the human-readable form from any data.json that validates against the schema. The structure description below is a reading guide for writers; the schema is the binding contract.
@@ -153,7 +153,7 @@
153
153
  "hooks": [
154
154
  {
155
155
  "type": "command",
156
- "command": "$HOME/.okstra/bin/okstra-trace-cleanup.sh"
156
+ "command": "$HOME/.okstra/bin/okstra-trace-cleanup.sh --reap"
157
157
  }
158
158
  ]
159
159
  }