npm - okstra - Versions diffs - 0.14.0 → 0.14.2 - Mend

okstra 0.14.0 → 0.14.2

Files changed (12)

package/docs/kr/architecture.md +1 -0
package/docs/kr/cli.md +2 -1
package/package.json +1 -1
package/runtime/BUILD.json +2 -2
package/runtime/agents/workers/claude-worker.md +1 -0
package/runtime/agents/workers/codex-worker.md +34 -15
package/runtime/agents/workers/gemini-worker.md +34 -15
package/runtime/bin/okstra-codex-exec.sh +25 -6
package/runtime/bin/okstra-gemini-exec.sh +25 -6
package/runtime/prompts/profiles/implementation.md +1 -0
package/runtime/skills/okstra-team-contract/SKILL.md +6 -0
package/runtime/templates/reports/settings.template.json +2 -0

package/docs/kr/architecture.md CHANGED

@@ -853,6 +853,7 @@ For the final report written by Claude, a more specific format in the brief
 - The standard workflow uses `Claude lead` plus the required workers `Claude worker`, `Codex worker`, `Gemini worker`, and `Report writer worker`.
 - Worker models can be overridden with `--lead-model`, `--claude-model`, `--codex-model`, `--gemini-model`, and `--report-writer-model`; defaults are managed centrally through the `OKSTRA_DEFAULT_*` environment variables. The fallback defaults are `Claude lead`/`Report writer worker`=`opus`, `Claude worker`=`sonnet`, `Codex worker`=`gpt-5.5`, and `Gemini worker`=`auto`.
 - With `--task-type implementation`, the provider that takes the Executor role is selected with `--executor <claude|codex|gemini>` (or `OKSTRA_DEFAULT_EXECUTOR`, fallback `claude`). Only the Executor may mutate project files; the other two providers and the Executor's own provider are all dispatched as verifiers in separate CLI sessions (session separation alone preserves the self-review safeguard). The Executor's model reuses the selected provider's worker model flag (`--claude-model` / `--codex-model` / `--gemini-model`) as-is, and the `teamContract.executor` block of the run-manifest records provider / displayName / workerAgent / model.
+- Per-executor worktree cwd injection: for the codex / gemini executors, the wrapper (`okstra-codex-exec.sh -C` / `okstra-gemini-exec.sh --include-directories`) pins the cwd to the worktree at the CLI layer. The Claude executor's Bash tool has no per-call cwd argument, so calls to cwd-sensitive toolchains (`cargo`/`npm`/`pnpm`/`bun`/`pytest`/`make`/`go`) are prefixed with `cd {{EXECUTOR_WORKTREE_PATH}} && <cmd>` within the same Bash invocation. Wrapping in `bash -lc`/`bash -c` is prohibited (it hides the leading `cd` token, so the permission auto-allow bypass fails), and when a working-directory flag exists (`git -C`, `cargo --manifest-path`, etc.) that flag takes precedence. For the detailed rules, see the *Executor Worktree* block in `prompts/profiles/implementation.md` and the Executor exception item in `agents/workers/claude-worker.md`.
 - The project-level current-task convenience pointer is `.project-docs/okstra/discovery/latest-task.json`.
 - The project-level canonical task inventory is `.project-docs/okstra/discovery/task-catalog.json`.
 - Project-local okstra Claude assets are seeded under `.claude/skills/` and `.claude/agents/`; they are preserved on a default rerun and can be regenerated with `--refresh-assets`.

package/docs/kr/cli.md CHANGED

@@ -287,7 +287,8 @@ The fallback defaults are as follows.
 - The Executor is the **only worker that may mutate project files** in this run. The other two providers are dispatched as strict read-only verifiers in the same run.
 - The Executor's model reuses the per-provider worker model flag as-is. That is, with `--executor codex` the Executor's model is `--codex-model` (default `gpt-5.5`), and with `--executor gemini` it is `--gemini-model` (default `auto`).
 - The three verifiers (Claude/Codex/Gemini) are always dispatched regardless of the executor provider. Even when a verifier shares the Executor's provider, it is invoked in a separate CLI session with an isolated context, so the self-review safeguard is preserved.
-- For Codex/Gemini, actual file changes happen through each CLI's auto-edit mode (e.g. `codex exec --sandbox workspace-write`) and do not go through the Claude-side Edit/Write tools.
+- For Codex/Gemini, actual file changes happen through each CLI's auto-edit mode (e.g. `codex exec --sandbox workspace-write`) and do not go through the Claude-side Edit/Write tools. In the implementation phase, the worker performs mutations in the task worktree described in the item below, and both wrappers (`scripts/okstra-codex-exec.sh`, `scripts/okstra-gemini-exec.sh`) take the worktree path as the fourth positional argument, forwarding it as `--add-dir` for codex and `--include-directories` for gemini. If it is omitted, the codex `workspace-write` sandbox rejects writes to the worktree with EPERM.
+- **cwd handling for the Claude executor**: The Claude Bash tool takes no per-call cwd argument and inherits the lead session's cwd, so to run cwd-sensitive toolchains (`cargo`, `npm`, `pnpm`, `bun`, `pytest`, `make`, `go`, etc.) inside the worktree, the call must be prefixed with `cd {{EXECUTOR_WORKTREE_PATH}} && <cmd>`. Claude Code's permission auto-allow works correctly only when `cd` remains the leading token within a single Bash call, so do not wrap the command in `bash -lc "..."` / `bash -c "..."` (wrapping hides the `cd` and triggers a permission prompt on every call). For tools that accept a working-directory flag, such as `git -C <path>`, `cargo --manifest-path`, and `pytest --rootdir`, prefer that flag over a `cd && ` chain. The Edit/Write/Read tools already use absolute paths and need no separate cwd handling. This rule applies only to the Claude executor; for the codex / gemini executors, the CLI wrapper injects the cwd.
 - **Task worktree (automatic isolation for every task-type)**: In the prepare step of the first phase of every task-type, `okstra-ctl` creates a `git worktree` at `~/.okstra/worktrees/<project-id>/<task-group-segment>/<task-id-segment>/` and branches `<work-category-prefix>-<task-id-segment>` from the main worktree's `HEAD`. Later phases of the same task-key reuse the same path/branch, so the status is recorded as `reused` (no new `git worktree add` happens at run-prep time). Special characters such as `/` and `:` in every segment are normalized to `-`, and `~/.okstra/worktrees/registry.json` globally manages the task-key → path/branch mapping (flock-guarded). The Executor's Edit/Write/build/test/commit and the verifiers' reads all take place inside this worktree. If the caller is already inside another worktree, or project_root is not a git repo, provisioning is skipped and the status is recorded as `skipped-in-worktree` / `skipped-not-git`. Path or branch conflicts fail immediately with `PrepareError`, and the worktree is not deleted automatically after the run ends (manual cleanup: `git worktree remove` → `git branch -D` + delete the registry entry).
 Example:
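
The segment normalization and path layout described in the Task worktree item can be illustrated with a short shell sketch. The function names `normalize_segment` and `worktree_path` are hypothetical and are not part of okstra. The exact character set that `okstra-ctl` rewrites is not stated (only `/` and `:` are named as examples), so the sketch assumes that every character outside `[A-Za-z0-9._-]` becomes `-`.

```bash
# Hypothetical helper: replace every character outside [A-Za-z0-9._-] with "-".
# Assumption: the real normalizer may use a different character set.
normalize_segment() {
  printf '%s' "$1" | sed 's/[^A-Za-z0-9._-]/-/g'
}

# Hypothetical helper: assemble the worktree path from its three segments, following
# the layout ~/.okstra/worktrees/<project-id>/<task-group-segment>/<task-id-segment>/.
worktree_path() {
  printf '%s/.okstra/worktrees/%s/%s/%s/\n' \
    "$HOME" \
    "$(normalize_segment "$1")" \
    "$(normalize_segment "$2")" \
    "$(normalize_segment "$3")"
}

# Example call with made-up identifiers.
worktree_path "my-project" "feature/auth" "TASK:42"
```

With these made-up inputs, the last two segments come out as `feature-auth` and `TASK-42`.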

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "okstra",
-  "version": "0.14.0",
+  "version": "0.14.2",
   "description": "Multi-agent cross-verification orchestrator runtime + Claude Code skills.",
   "license": "MIT",
   "author": "devonshin",

package/runtime/BUILD.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "package": "0.14.0",
-  "builtAt": "2026-05-13T02:55:11.516Z",
+  "package": "0.14.2",
+  "builtAt": "2026-05-13T03:43:46.461Z",
   "repoRoot": "/home/runner/work/okstra/okstra"
 }

package/runtime/agents/workers/claude-worker.md CHANGED

@@ -42,6 +42,7 @@ Unlike the Codex / Gemini workers, you are an in-process Claude subagent — you
    - If the parent directory does not exist yet, create it before writing.
 4. Anchor all file operations to the absolute `Project Root` from the lead prompt. Use absolute paths — do NOT rely on inherited cwd. Never use `cd` to change directory.
+   - **Executor exception (implementation phase only):** when this worker is dispatched as the `Executor` and the lead prompt provides an `EXECUTOR_WORKTREE_PATH` that differs from the session's inherited cwd, cwd-sensitive Bash commands (`cargo *`, `npm *`, `pnpm *`, `bun *`, `pytest`, `make *`, `go *`, language-toolchain test/build commands) MUST be prefixed with `cd <EXECUTOR_WORKTREE_PATH> && ` in the same Bash invocation — e.g. `cd /Users/.../worktrees/foo && cargo test -p bar`. Do NOT wrap the whole thing in `bash -lc "..."` or `bash -c "..."`; pass the chained command directly to the Bash tool so the leading `cd` token remains visible to the permission layer. The `cd` is scoped to the single Bash subshell and does not mutate the session's shell state, so this does not conflict with the "never use cd" rule above (which prevents the worker from drifting the session cwd across calls). Verifier roles do NOT use this exception — they read with absolute paths only.
 5. **MCP usage**: The canonical list of MCP servers and tools available for this run lives in the lead prompt's `## Available MCP Servers` section (sourced from `.project-docs/okstra/project.json`'s `mcpServers` array). When the task requires inspection of an external system covered by one of those servers, call the listed tool directly by name (e.g. `mcp__<server>__<tool>`). Do NOT shell out via `claude --mcp-cli call ...` or run the tool name as a Bash command — those are not valid invocation paths. If a server you need is not listed, record `MCP not available for this run` in your worker output rather than guessing a tool name.
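
The scoping claim in the Executor exception under item 4 (the `cd` affects only the one invocation and does not drift the session cwd) can be checked with a small shell sketch. `run_in_worktree` is a hypothetical helper, not part of okstra, and the parenthesized subshell stands in for a single Bash tool call; that mapping is an assumption made purely for illustration.

```bash
# Hypothetical helper: run a command with the worktree as cwd, inside a subshell
# so the caller's working directory is left untouched.
run_in_worktree() {
  local worktree="$1"
  shift
  ( cd "$worktree" && "$@" )
}

worktree="$(mktemp -d)"   # stand-in for EXECUTOR_WORKTREE_PATH
before="$PWD"

run_in_worktree "$worktree" pwd                        # prints the worktree directory
[ "$PWD" = "$before" ] && echo "caller cwd unchanged"
```

Inside the real Claude executor the chained form `cd <EXECUTOR_WORKTREE_PATH> && <cmd>` is passed to the Bash tool directly, without a helper, so that the leading `cd` token stays visible to the permission layer.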

package/runtime/agents/workers/codex-worker.md CHANGED

@@ -18,7 +18,7 @@ description: |
   </example>
 model: inherit
 color: cyan
-tools: ["Bash", "Read", "Write", "Glob", "Grep"]
+tools: ["Bash", "BashOutput", "KillShell", "Read", "Write", "Glob", "Grep"]
 ---
 You are a Codex worker agent. Your job is to execute the OpenAI Codex CLI and return the analysis result.
@@ -27,12 +27,14 @@ You are a Codex worker agent. Your job is to execute the OpenAI Codex CLI and re
 **Required form (uses the okstra wrapper to avoid redirect-triggered permission prompts):**
 ```bash
-$HOME/.okstra/bin/okstra-codex-exec.sh "<absolute-project-root>" "<assigned-model-execution-value>" "<absolute-prompt-history-path>"
+$HOME/.okstra/bin/okstra-codex-exec.sh "<absolute-project-root>" "<assigned-model-execution-value>" "<absolute-prompt-history-path>" [<absolute-worktree-path>]
 ```
+The fourth argument is **mandatory for implementation phase** and optional otherwise. It must be the literal `EXECUTOR_WORKTREE_PATH` recorded in the run context; the wrapper forwards it to codex as `--add-dir`, which grants the codex sandbox write access to the worktree (where all implementation-phase mutations occur). Without it, codex's `workspace-write` sandbox is anchored only at `<project-root>` and rejects every Edit/Write that targets the worktree (EPERM), which is the failure pattern that originally motivated this argument.
 The wrapper internally runs:
 ```bash
-codex exec -C "<project-root>" --model "<model>" --sandbox workspace-write - < "<prompt-path>" 2>/dev/null
+codex exec -C "<project-root>" [--add-dir "<worktree-path>"] --model "<model>" --sandbox workspace-write - < "<prompt-path>" 2>/dev/null
 ```
 The wrapper exists because Claude Code's Bash permission matcher rejects simple-prefix matches when the command contains stdin/stderr redirects. Calling `codex exec ... - < <path> 2>/dev/null` directly triggers a permission prompt every dispatch even when `Bash(codex exec:*)` is allowlisted. The wrapper folds the redirects inside, so the harness sees a single non-redirect command that matches `Bash($HOME/.okstra/bin/okstra-codex-exec.sh:*)`.
@@ -67,20 +69,34 @@ The wrapper exists because Claude Code's Bash permission matcher rejects simple-
    - If neither is available, immediately return `CODEX_MODEL_MISSING: assigned Codex model execution value was not provided`. Do NOT fall back to training-data defaults — historical codex defaults like `o4-mini` are NOT acceptable substitutes for the assigned model. Returning the sentinel is the correct behavior; the lead is responsible for fixing its prompt and redispatching.
    - This rule applies equally to convergence reverify rounds. The reverify prompt MUST carry the same `**Model:**` line as the initial run (see `okstra-convergence` skill, "Required reverify-prompt anchor headers"). If the line is absent in a reverify prompt, return `CODEX_MODEL_MISSING` rather than guessing.
-7. If installed, execute the prompt through the wrapper as a single command with a Bash timeout of 120000 ms. Pass the three positional arguments verbatim — do NOT use environment variables, `cd`, `&&` chains, or pipes from `cat`:
+7. If installed, dispatch the wrapper as a **background** Bash command and poll for completion. The two-minute foreground Bash timeout is insufficient for implementation-phase Codex runs and forced workers into ad-hoc background dispatch with lost output. The polling contract below is the formal replacement.
+   **Dispatch (background, no foreground timeout):**
    ```bash
-   $HOME/.okstra/bin/okstra-codex-exec.sh "<absolute-project-root>" "<assigned-model-execution-value>" "<absolute-prompt-history-path>"
+   $HOME/.okstra/bin/okstra-codex-exec.sh "<absolute-project-root>" "<assigned-model-execution-value>" "<absolute-prompt-history-path>" "<absolute-worktree-path>"
    ```
-   Substitute the literal extracted Project Root, model execution value, and prompt-history path in place of the placeholders above. The wrapper handles `-C`, `--model`, `--sandbox workspace-write`, the stdin redirect from the prompt file, and stderr suppression internally. Calling `codex exec` directly (without the wrapper) is an error in this skill: the redirect tokens disqualify the prefix match against `Bash(codex exec:*)` and produce a permission prompt every dispatch.
+   Call `Bash` with `run_in_background: true`. Capture the returned `bash_id` (a.k.a. `shell_id`). Pass the positional arguments verbatim — do NOT use environment variables, `cd`, `&&` chains, or pipes from `cat`. Substitute the literal extracted Project Root, model execution value, prompt-history path, and worktree path. The fourth argument is **mandatory for implementation phase** (extract from `EXECUTOR_WORKTREE_PATH` in the lead prompt's run context or the `**Worktree:**` / `cwd for every mutating command:` line) and **may be omitted only for non-implementation analysis phases** that do not mutate the worktree. Omitting it during implementation will cause every Edit/Write to fail with EPERM. The wrapper handles `-C`, `--add-dir`, `--model`, `--sandbox workspace-write`, the stdin redirect from the prompt file, and stderr suppression internally. Calling `codex exec` directly (without the wrapper) is an error in this skill: the redirect tokens disqualify the prefix match against `Bash(codex exec:*)` and produce a permission prompt every dispatch.
+   **Poll loop (60-second cadence, 30-minute hard cap):**
+   - Record `start_ts` at dispatch time.
+   - Repeat:
+     1. Foreground `Bash` call: `sleep 60` (or shorter on the first iteration if you expect a fast finish).
+     2. Call `BashOutput(bash_id: <shell_id>)`. Inspect `status`.
+     3. If `status == "completed"`: break out of the loop and proceed to step 8.
+     4. If `now - start_ts > 1800` seconds: call `KillShell(shell_id: <shell_id>)`, then record a `cli-failure` event with `--error-type cli-failure`, `--exit-code 124`, `--duration-ms 1800000`, `--message "okstra-codex-exec.sh exceeded 30m polling cap"`, and return `CODEX_CLI_TIMEOUT: codex exec exceeded 30-minute polling cap`.
+     5. Otherwise continue polling.
+   - Do NOT abort the loop on transient `running` status. Only `completed` or the 30-minute cap end it.
+   - Do NOT issue parallel `BashOutput` calls or speculate about progress between polls.
-8. Return the codex output as-is without modification.
+8. Concatenate the wrapper's accumulated stdout from `BashOutput` and return it as-is without modification. If the final `BashOutput` reports a non-zero `exit_code`, follow the **CLI failure** rule in §"Error reporting" before returning.
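
The poll loop in step 7 runs through Claude Code tool calls (`Bash`, `BashOutput`, `KillShell`), which a plain script cannot reproduce. As a rough shell analogue only, the same dispatch / poll / hard-cap shape might look like the sketch below. `poll_with_cap` is a hypothetical name, `kill -0` and `kill` stand in for `BashOutput` and `KillShell`, and the return value 124 mirrors the `--exit-code 124` recorded on the timeout path.

```bash
# Hypothetical analogue of the polling contract: start a command in the
# background, check on it every <interval> seconds, and kill it after <cap> seconds.
poll_with_cap() {
  local interval="$1" cap="$2"
  shift 2
  "$@" &                                  # dispatch in the background
  local pid=$! start=$SECONDS
  while kill -0 "$pid" 2>/dev/null; do    # still running?
    if (( SECONDS - start >= cap )); then
      kill "$pid" 2>/dev/null || true     # hard cap reached: terminate
      wait "$pid" 2>/dev/null || true
      return 124
    fi
    sleep "$interval"
  done
  wait "$pid"                             # propagate the command's own exit code
}

# The contract above uses interval=60 and cap=1800; small values keep the demo short.
poll_with_cap 1 5 sleep 1
echo "exit=$?"
```

Unlike the worker contract, this analogue does not capture output; in the worker, the accumulated stdout comes from `BashOutput`.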
 ## Stop Condition
 This wrapper is a thin Bash-execution shell over the Codex CLI (via `okstra-codex-exec.sh`). The CLI process itself is the analysis engine; this subagent's only job is to dispatch it and forward output. Therefore:
-- Return immediately after the wrapper script call returns (or after recording any required `cli-failure` event for a non-zero exit / timeout / rate-limit).
-- Do NOT perform additional `Read`, `Grep`, `Glob`, or other tool calls before or after the CLI dispatch beyond what is strictly required by steps 1–7 (prompt persistence, Project Root extraction, model resolution).
+- Return immediately after the polling loop exits with `completed` (or after recording any required `cli-failure` event for a non-zero exit / 30-minute cap / rate-limit).
+- The only tool calls permitted during the polling loop are `Bash` (for `sleep`), `BashOutput`, and — on the timeout path only — `KillShell`. Do NOT perform additional `Read`, `Grep`, `Glob` calls between polls; do NOT inspect intermediate wrapper output mid-run.
+- Outside the polling loop, no `Read`, `Grep`, or `Glob` beyond what is strictly required by steps 1–7 (prompt persistence, Project Root extraction, model resolution).
 - Do NOT re-invoke `okstra-codex-exec.sh` to "double-check" or "rerun for safety" — convergence (Phase 5.5) handles cross-worker reconciliation. A single CLI dispatch per dispatched-prompt is the contract.
 The Codex CLI's own exit terminates the underlying analysis; this wrapper terminates by returning its captured output (or sentinel).
@@ -96,9 +112,9 @@ This wrapper does NOT invoke MCP tools directly. MCP availability inside the Cod
 - The assigned model execution value is canonical for CLI execution. Do not substitute a different Codex model unless the task bundle explicitly changes it.
 - Pass the prompt received from Lead directly to codex after persisting the exact prompt to the assigned path.
 - Include context (code, diff, file paths) if provided.
-- For long prompts, the wrapper script reads from the saved project-local prompt history file via stdin redirect internally. The caller invokes the wrapper with three positional args:
+- For long prompts, the wrapper script reads from the saved project-local prompt history file via stdin redirect internally. The caller invokes the wrapper with three required positional args + the worktree path for implementation phase:
   ```bash
-  $HOME/.okstra/bin/okstra-codex-exec.sh "<literal-project-root>" "<assigned-model-execution-value>" "<literal-prompt-history-path>"
+  $HOME/.okstra/bin/okstra-codex-exec.sh "<literal-project-root>" "<assigned-model-execution-value>" "<literal-prompt-history-path>" "<literal-worktree-path>"
   ```
 - If the parent directory does not exist yet, create it before writing the prompt file.
@@ -146,9 +162,12 @@ two kinds of errors via `scripts/okstra-error-log.py`:
    then append. Lead will dump it to the run error log after this subagent
    terminates.
-2. **CLI failure (lead-observed)** — if `codex exec` returns non-zero, times
-   out (Bash 120000ms), or returns a rate-limit/auth message, immediately
-   append a `cli-failure` event directly to the run error log:
+2. **CLI failure (lead-observed)** — if the wrapper's final `BashOutput`
+   reports a non-zero `exit_code`, the 30-minute polling cap is hit, or the
+   captured stdout/stderr carries a rate-limit/auth message, immediately
+   append a `cli-failure` event directly to the run error log. The
+   30-minute-cap path additionally requires a prior `KillShell` call against
+   the dispatched `bash_id`:
    ```bash
    python3 scripts/okstra-error-log.py append-observed \
@@ -158,7 +177,7 @@ two kinds of errors via `scripts/okstra-error-log.py`:
      --agent codex-worker --agent-role worker \
      --model "<assigned-model-execution-value>" \
      --error-type cli-failure \
-     --command "$HOME/.okstra/bin/okstra-codex-exec.sh <project-root> <m> <prompt-path>" \
+     --command "$HOME/.okstra/bin/okstra-codex-exec.sh <project-root> <m> <prompt-path> <worktree-path>" \
      --command-kind cli-invoke \
      --exit-code <N> --duration-ms <ms> \
      --message "<one-line summary>" \

package/runtime/agents/workers/gemini-worker.md CHANGED

@@ -18,7 +18,7 @@ description: |
   </example>
 model: inherit
 color: green
-tools: ["Bash", "Read", "Write", "Glob", "Grep"]
+tools: ["Bash", "BashOutput", "KillShell", "Read", "Write", "Glob", "Grep"]
 ---
 You are a Gemini worker agent. Your job is to execute the Google Gemini CLI and return the analysis result.
@@ -27,12 +27,14 @@ You are a Gemini worker agent. Your job is to execute the Google Gemini CLI and
 **Required form (uses the okstra wrapper to avoid redirect-triggered permission prompts):**
 ```bash
-$HOME/.okstra/bin/okstra-gemini-exec.sh "<absolute-project-root>" "<assigned-model-execution-value>" "<absolute-prompt-history-path>"
+$HOME/.okstra/bin/okstra-gemini-exec.sh "<absolute-project-root>" "<assigned-model-execution-value>" "<absolute-prompt-history-path>" [<absolute-worktree-path>]
 ```
+The fourth argument is **mandatory for implementation phase** and optional otherwise. It must be the literal `EXECUTOR_WORKTREE_PATH` recorded in the run context; the wrapper appends it to gemini's `--include-directories` list so the model can both read and operate on the worktree alongside project-root.
 The wrapper internally runs:
 ```bash
-gemini -p - -m "<model>" -o text --include-directories "<project-root>" < "<prompt-path>" 2>/dev/null
+gemini -p - -m "<model>" -o text --include-directories "<project-root>[,<worktree-path>]" < "<prompt-path>" 2>/dev/null
 ```
 The wrapper exists because Claude Code's Bash permission matcher rejects simple-prefix matches when the command contains stdin/stderr redirects. Calling `gemini -p - ... < <path> 2>/dev/null` directly triggers a permission prompt every dispatch even when `Bash(gemini:*)` is allowlisted. The wrapper folds the redirects inside, so the harness sees a single non-redirect command that matches `Bash($HOME/.okstra/bin/okstra-gemini-exec.sh:*)`.
@@ -67,20 +69,34 @@ The wrapper exists because Claude Code's Bash permission matcher rejects simple-
    - If no assigned model execution value can be determined, immediately return `GEMINI_MODEL_MISSING: assigned Gemini model execution value was not provided`. Do NOT fall back to training-data defaults — historical Gemini defaults (e.g. `gemini-1.5-flash`) are NOT acceptable substitutes for the assigned model. Returning the sentinel is the correct behavior; the lead is responsible for fixing its prompt and redispatching.
    - This rule applies equally to convergence reverify rounds. The reverify prompt MUST carry the same `**Model:**` line as the initial run (see `okstra-convergence` skill, "Required reverify-prompt anchor headers"). If the line is absent in a reverify prompt, return `GEMINI_MODEL_MISSING` rather than guessing.
-7. If installed, execute the prompt through the wrapper as a single command with a Bash timeout of 120000 ms. Pass the three positional arguments verbatim — do NOT use environment variables, `cd`, `&&` chains, or pipes from `cat`:
+7. If installed, dispatch the wrapper as a **background** Bash command and poll for completion. The two-minute foreground Bash timeout is insufficient for implementation-phase Gemini runs and forced workers into ad-hoc background dispatch with lost output. The polling contract below is the formal replacement.
+   **Dispatch (background, no foreground timeout):**
    ```bash
-   $HOME/.okstra/bin/okstra-gemini-exec.sh "<absolute-project-root>" "<assigned-model-execution-value>" "<absolute-prompt-history-path>"
+   $HOME/.okstra/bin/okstra-gemini-exec.sh "<absolute-project-root>" "<assigned-model-execution-value>" "<absolute-prompt-history-path>" "<absolute-worktree-path>"
    ```
-   Substitute the literal extracted Project Root, model execution value, and prompt-history path in place of the placeholders above. The wrapper handles `-p -`, `-m`, `-o text`, `--include-directories`, the stdin redirect from the prompt file, and stderr suppression internally. Calling `gemini` directly (without the wrapper) is an error in this skill: the redirect tokens disqualify the prefix match against `Bash(gemini:*)` and produce a permission prompt every dispatch.
+   Call `Bash` with `run_in_background: true`. Capture the returned `bash_id` (a.k.a. `shell_id`). Pass the positional arguments verbatim — do NOT use environment variables, `cd`, `&&` chains, or pipes from `cat`. Substitute the literal extracted Project Root, model execution value, prompt-history path, and worktree path. The fourth argument is **mandatory for implementation phase** (extract from `EXECUTOR_WORKTREE_PATH` in the lead prompt's run context or the `**Worktree:**` / `cwd for every mutating command:` line) and **may be omitted only for non-implementation analysis phases** that do not mutate the worktree. The wrapper handles `-p -`, `-m`, `-o text`, `--include-directories`, the stdin redirect from the prompt file, and stderr suppression internally. Calling `gemini` directly (without the wrapper) is an error in this skill: the redirect tokens disqualify the prefix match against `Bash(gemini:*)` and produce a permission prompt every dispatch.
+   **Poll loop (60-second cadence, 30-minute hard cap):**
+   - Record `start_ts` at dispatch time.
+   - Repeat:
+     1. Foreground `Bash` call: `sleep 60` (or shorter on the first iteration if you expect a fast finish).
+     2. Call `BashOutput(bash_id: <shell_id>)`. Inspect `status`.
+     3. If `status == "completed"`: break out of the loop and proceed to step 8.
+     4. If `now - start_ts > 1800` seconds: call `KillShell(shell_id: <shell_id>)`, then record a `cli-failure` event with `--error-type cli-failure`, `--exit-code 124`, `--duration-ms 1800000`, `--message "okstra-gemini-exec.sh exceeded 30m polling cap"`, and return `GEMINI_CLI_TIMEOUT: gemini exec exceeded 30-minute polling cap`.
+     5. Otherwise continue polling.
+   - Do NOT abort the loop on transient `running` status. Only `completed` or the 30-minute cap end it.
+   - Do NOT issue parallel `BashOutput` calls or speculate about progress between polls.
-8. Return the gemini output as-is without modification.
+8. Concatenate the wrapper's accumulated stdout from `BashOutput` and return it as-is without modification. If the final `BashOutput` reports a non-zero `exit_code`, follow the **CLI failure** rule in §"Error reporting" before returning.
 ## Stop Condition
 This wrapper is a thin Bash-execution shell over the Gemini CLI (via `okstra-gemini-exec.sh`). The CLI process itself is the analysis engine; this subagent's only job is to dispatch it and forward output. Therefore:
-- Return immediately after the wrapper script call returns (or after recording any required `cli-failure` event for a non-zero exit / timeout / rate-limit).
-- Do NOT perform additional `Read`, `Grep`, `Glob`, or other tool calls before or after the CLI dispatch beyond what is strictly required by steps 1–7 (prompt persistence, Project Root extraction, model resolution).
+- Return immediately after the polling loop exits with `completed` (or after recording any required `cli-failure` event for a non-zero exit / 30-minute cap / rate-limit).
+- The only tool calls permitted during the polling loop are `Bash` (for `sleep`), `BashOutput`, and — on the timeout path only — `KillShell`. Do NOT perform additional `Read`, `Grep`, `Glob` calls between polls; do NOT inspect intermediate wrapper output mid-run.
+- Outside the polling loop, no `Read`, `Grep`, or `Glob` beyond what is strictly required by steps 1–7 (prompt persistence, Project Root extraction, model resolution).
 - Do NOT re-invoke `okstra-gemini-exec.sh` to "double-check" or "rerun for safety" — convergence (Phase 5.5) handles cross-worker reconciliation. A single CLI dispatch per dispatched-prompt is the contract.
 The Gemini CLI's own exit terminates the underlying analysis; this wrapper terminates by returning its captured output (or sentinel).
@@ -96,9 +112,9 @@ This wrapper does NOT invoke MCP tools directly. MCP availability inside the Gem
 - The assigned model execution value is canonical for CLI execution. Do not substitute a different Gemini model unless the task bundle explicitly changes it.
 - Pass the prompt received from Lead directly to gemini after persisting the exact prompt to the assigned path.
 - Include context (code, diff, file paths) if provided.
-- For long prompts, dispatch through the wrapper with literal absolute paths:
+- For long prompts, dispatch through the wrapper with literal absolute paths (plus the worktree path for implementation phase):
   ```bash
-  $HOME/.okstra/bin/okstra-gemini-exec.sh "<literal-project-root>" "<assigned-model-execution-value>" "<literal-prompt-history-path>"
+  $HOME/.okstra/bin/okstra-gemini-exec.sh "<literal-project-root>" "<assigned-model-execution-value>" "<literal-prompt-history-path>" "<literal-worktree-path>"
   ```
 - If the parent directory does not exist yet, create it before writing the prompt file.
@@ -146,9 +162,12 @@ two kinds of errors via `scripts/okstra-error-log.py`:
    then append. Lead will dump it to the run error log after this subagent
    terminates.
-2. **CLI failure (lead-observed)** — if `gemini` returns non-zero, times out
-   (Bash 120000ms), or returns a rate-limit/auth message, immediately append
-   a `cli-failure` event directly to the run error log:
+2. **CLI failure (lead-observed)** — if the wrapper's final `BashOutput`
+   reports a non-zero `exit_code`, the 30-minute polling cap is hit, or the
+   captured stdout/stderr carries a rate-limit/auth message, immediately
+   append a `cli-failure` event directly to the run error log. The
+   30-minute-cap path additionally requires a prior `KillShell` call against
+   the dispatched `bash_id`:
    ```bash
    python3 scripts/okstra-error-log.py append-observed \
@@ -158,7 +177,7 @@ two kinds of errors via `scripts/okstra-error-log.py`:
      --agent gemini-worker --agent-role worker \
      --model "<assigned-model-execution-value>" \
      --error-type cli-failure \
-     --command "$HOME/.okstra/bin/okstra-gemini-exec.sh <project-root> <m> <prompt-path>" \
+     --command "$HOME/.okstra/bin/okstra-gemini-exec.sh <project-root> <m> <prompt-path> <worktree-path>" \
      --command-kind cli-invoke \
      --exit-code <N> --duration-ms <ms> \
      --message "<one-line summary>" \

package/runtime/bin/okstra-codex-exec.sh CHANGED

@@ -13,20 +13,29 @@
 #   Bash($HOME/.okstra/bin/okstra-codex-exec.sh:*)
 #
 # Usage:
-#   okstra-codex-exec.sh <project-root> <model-execution-value> <prompt-path>
+#   okstra-codex-exec.sh <project-root> <model-execution-value> <prompt-path> [worktree-path]
 #
-# All three arguments are required and must be absolute paths or literal model
-# strings. The wrapper exits non-zero on any preflight failure.
+# project-root / model-execution-value / prompt-path are required.
+#
+# worktree-path is optional and used for okstra implementation phase, where the
+# executor must mutate files inside a git worktree that lives outside
+# project-root. When supplied (non-empty), it is forwarded to codex as
+# `--add-dir <worktree-path>` so the codex sandbox grants write access to that
+# directory alongside the primary workspace anchored at project-root. When
+# omitted or empty, no `--add-dir` is added (existing analysis-phase behavior).
+#
+# The wrapper exits non-zero on any preflight failure.
 set -euo pipefail
-if [[ $# -ne 3 ]]; then
-  printf 'usage: %s <project-root> <model-execution-value> <prompt-path>\n' "$(basename "$0")" >&2
+if [[ $# -lt 3 || $# -gt 4 ]]; then
+  printf 'usage: %s <project-root> <model-execution-value> <prompt-path> [worktree-path]\n' "$(basename "$0")" >&2
   exit 64
 fi
 project_root="$1"
 model="$2"
 prompt_path="$3"
+worktree_path="${4-}"
 if [[ -z "$project_root" || ! -d "$project_root" ]]; then
   printf 'okstra-codex-exec: project-root is missing or not a directory: %q\n' "$project_root" >&2
@@ -48,6 +57,16 @@ if ! command -v codex >/dev/null 2>&1; then
   exit 127
 fi
+if [[ -n "$worktree_path" && ! -d "$worktree_path" ]]; then
+  printf 'okstra-codex-exec: worktree-path was provided but is not a directory: %q\n' "$worktree_path" >&2
+  exit 68
+fi
+extra_args=()
+if [[ -n "$worktree_path" ]]; then
+  extra_args+=(--add-dir "$worktree_path")
+fi
 # stdin redirect and stderr suppression are intentionally inside the wrapper —
 # this is the entire reason this script exists.
-exec codex exec -C "$project_root" --model "$model" --sandbox workspace-write - < "$prompt_path" 2>/dev/null
+exec codex exec -C "$project_root" ${extra_args[@]+"${extra_args[@]}"} --model "$model" --sandbox workspace-write - < "$prompt_path" 2>/dev/null
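
The `${extra_args[@]+"${extra_args[@]}"}` expansion in the final `exec` line guards against a known bash behavior: before bash 4.4, expanding an empty array under `set -u` aborts with an unbound-variable error. A minimal sketch of the idiom follows. `count_args` and `demo_optional_args` are hypothetical names, and the worktree path is a made-up example.

```bash
# Hypothetical helper: print how many arguments it received.
count_args() { printf '%s\n' "$#"; }

# The function body is a subshell (parentheses) so `set -u` does not leak out.
demo_optional_args() (
  set -u
  worktree_path="${1-}"            # same default-to-empty form as the wrapper
  extra_args=()
  if [[ -n "$worktree_path" ]]; then
    extra_args+=(--add-dir "$worktree_path")
  fi
  # Expands to nothing for an empty array, and to the quoted elements otherwise.
  count_args ${extra_args[@]+"${extra_args[@]}"}
)

demo_optional_args                           # prints 0
demo_optional_args "/tmp/example worktree"   # prints 2
```

The inner quoting keeps a path containing spaces as a single argument, which is why the second call reports two arguments (`--add-dir` and the path) rather than three.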

package/runtime/bin/okstra-gemini-exec.sh CHANGED

@@ -13,20 +13,29 @@
 #   Bash($HOME/.okstra/bin/okstra-gemini-exec.sh:*)
 #
 # Usage:
-#   okstra-gemini-exec.sh <project-root> <model-execution-value> <prompt-path>
+#   okstra-gemini-exec.sh <project-root> <model-execution-value> <prompt-path> [worktree-path]
 #
-# All three arguments are required and must be absolute paths or literal model
-# strings. The wrapper exits non-zero on any preflight failure.
+# project-root / model-execution-value / prompt-path are required.
+#
+# worktree-path is optional and used for okstra implementation phase, where the
+# executor must mutate files inside a git worktree that lives outside
+# project-root. When supplied (non-empty), it is appended to gemini's
+# `--include-directories` list (comma-separated) so the model can see and
+# operate on the worktree alongside the primary workspace. When omitted or
+# empty, only project-root is included (existing analysis-phase behavior).
+#
+# The wrapper exits non-zero on any preflight failure.
 set -euo pipefail
-if [[ $# -ne 3 ]]; then
-  printf 'usage: %s <project-root> <model-execution-value> <prompt-path>\n' "$(basename "$0")" >&2
+if [[ $# -lt 3 || $# -gt 4 ]]; then
+  printf 'usage: %s <project-root> <model-execution-value> <prompt-path> [worktree-path]\n' "$(basename "$0")" >&2
   exit 64
 fi
 project_root="$1"
 model="$2"
 prompt_path="$3"
+worktree_path="${4-}"
 if [[ -z "$project_root" || ! -d "$project_root" ]]; then
   printf 'okstra-gemini-exec: project-root is missing or not a directory: %q\n' "$project_root" >&2
@@ -48,8 +57,18 @@ if ! command -v gemini >/dev/null 2>&1; then
   exit 127
 fi
+if [[ -n "$worktree_path" && ! -d "$worktree_path" ]]; then
+  printf 'okstra-gemini-exec: worktree-path was provided but is not a directory: %q\n' "$worktree_path" >&2
+  exit 68
+fi
+include_dirs="$project_root"
+if [[ -n "$worktree_path" ]]; then
+  include_dirs="$project_root,$worktree_path"
+fi
 # stdin redirect and stderr suppression are intentionally inside the wrapper —
 # this is the entire reason this script exists. Gemini CLI has no `--cd` flag,
 # so workspace correctness is anchored via `--include-directories` plus the
 # Project Root referenced in the prompt body itself.
-exec gemini -p - -m "$model" -o text --include-directories "$project_root" < "$prompt_path" 2>/dev/null
+exec gemini -p - -m "$model" -o text --include-directories "$include_dirs" < "$prompt_path" 2>/dev/null
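
The way the gemini wrapper above assembles its `--include-directories` value can be sketched on its own. This is an illustrative reduction under stated assumptions, not the package's code: `include_dirs_for` is a hypothetical function name, and it only prints the value rather than invoking `gemini`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of okstra): returns the comma-separated value
# that would be passed to --include-directories.
include_dirs_for() {
  local project_root="$1" worktree_path="${2-}"
  local include_dirs="$project_root"
  if [[ -n "$worktree_path" ]]; then
    # The worktree is appended after project-root, separated by a comma.
    include_dirs="$project_root,$worktree_path"
  fi
  printf '%s\n' "$include_dirs"
}

include_dirs_for /repo                           # analysis phase: project-root only
include_dirs_for /repo /home/u/.okstra/worktrees/t1   # implementation phase: both
```

Because the list is comma-separated, a path that itself contains a comma would presumably be read as two entries; the wrapper as shown does not guard against that case.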

package/runtime/prompts/profiles/implementation.md CHANGED

@@ -35,6 +35,7 @@
   - Base ref: `{{EXECUTOR_WORKTREE_BASE_REF}}` — commit SHA the worktree was branched from at the first phase; canonical `<base>` for every `git diff` / `git log` in this run.
   - Provisioning note: `{{EXECUTOR_WORKTREE_NOTE}}`
   - **Executor behaviour**: when status is `created` or `reused`, the Executor MUST run every Edit / Write / build / test / commit command with the working tree path above as cwd. Treat it as `project_root` for the duration of this run. Do NOT mutate the caller's original checkout. Do NOT `cd` out of the worktree to reach files; if a file outside the worktree is needed, the dependency is a planning gap — record it in `Out-of-plan edits` and continue.
+    - **How to set cwd per Bash call**: the Claude Bash tool inherits its cwd from the lead session, which is NOT the worktree. To put cwd-sensitive toolchains (`cargo`, `npm`, `pnpm`, `bun`, `pytest`, `make`, `go`) into the worktree, prefix the command with `cd {{EXECUTOR_WORKTREE_PATH}} && ` inside the same Bash invocation — e.g. `cd {{EXECUTOR_WORKTREE_PATH}} && cargo test -p foo`. **Never wrap in `bash -lc "..."` or `bash -c "..."`** — the wrapper hides the leading `cd` token from Claude Code's permission auto-allow layer (causing prompts on every call) without any safety benefit. For tools that accept an explicit working-directory flag (`git -C <path>`, `cargo --manifest-path`, `pytest --rootdir`), prefer that form over the `cd && ` chain. Edit / Write / Read tool calls already use absolute paths and need no cwd handling. The codex / gemini executor CLI wrappers (`okstra-codex-exec.sh -C`, `okstra-gemini-exec.sh --include-directories`) already inject worktree cwd at the CLI layer, so this rule applies primarily to the Claude executor.
   - **Verifier behaviour**: all verifier roles in the resolved roster read from the SAME working tree path so they observe the exact diff the Executor produced. Verifiers remain strictly read-only there.
   - **Lifecycle**: the worktree is kept after the run completes (no automatic cleanup) and is reused by every subsequent phase of the same task-key. Cleanup, when the task is fully done, is manual: `git -C <main-worktree> worktree remove <path>` followed by `git -C <main-worktree> branch -D <branch>`, plus removing the task-key entry from `~/.okstra/worktrees/registry.json`.
   - **Skipped paths**: when status is `skipped-in-worktree` or `skipped-not-git`, the executor operates in `project_root` as before. Cite the status in the final report's metadata header so reviewers know which path was taken.
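
The two ways of targeting the worktree described in the added "How to set cwd per Bash call" line can be compared with a small sketch. Everything here is a stand-in chosen for illustration: a temporary directory plays the role of `{{EXECUTOR_WORKTREE_PATH}}`, and `ls` with a directory operand plays the role of a tool that accepts an explicit working-directory flag such as `git -C <path>`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Stand-in for {{EXECUTOR_WORKTREE_PATH}}; a real run uses the provisioned worktree.
worktree="$(mktemp -d)"
trap 'rm -rf "$worktree"' EXIT
touch "$worktree/marker.txt"
start_dir="$PWD"

# Form 1: `cd <path> && <command>` inside a single invocation. The command
# substitution runs in a subshell, so the caller's cwd is not changed.
listing="$(cd "$worktree" && ls)"

# Form 2: pass the directory explicitly when the tool supports it.
listing_flag="$(ls "$worktree")"

printf 'cd chain:      %s\n' "$listing"
printf 'explicit path: %s\n' "$listing_flag"
printf 'caller cwd:    %s\n' "$PWD"
```

Both forms see the same files, and the caller's working directory is the same before and after, which is the property the prompt text relies on when it asks for a leading `cd ... &&` rather than a nested `bash -c` wrapper.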

package/runtime/skills/okstra-team-contract/SKILL.md CHANGED

@@ -270,6 +270,12 @@ Workers MUST omit `source` / `recordedAt` / `agent` / `agentRole` / `model` /
 Workers MUST use only `errorType: "tool-failure"` in the **sidecar file**.
 - `cli-failure` events are recorded by the wrapper subagent itself (Codex / Gemini), but **directly to the run-level error log** via `okstra-error-log.py append-observed --error-type cli-failure ...` — NOT via the sidecar. The sidecar is an in-process tool-failure channel only.
+- **Wrapper invocation arity.** Both `okstra-codex-exec.sh` and `okstra-gemini-exec.sh` accept four positional arguments: `<project-root> <model> <prompt-path> [<worktree-path>]`. The fourth (worktree) argument is **mandatory for implementation phase** and optional otherwise. For codex it becomes `--add-dir <worktree>` (sandbox write access); for gemini it is appended to `--include-directories`. Omitting it during implementation causes the codex sandbox to reject every Edit/Write targeting the worktree with EPERM. Workers extract the path from the `**Worktree:**` / `EXECUTOR_WORKTREE_PATH` / `cwd for every mutating command:` line in the lead prompt.
+- **Background dispatch + polling contract (Codex / Gemini wrappers).** Both wrapper subagents MUST dispatch `okstra-codex-exec.sh` / `okstra-gemini-exec.sh` via `Bash(run_in_background: true)` and poll with `BashOutput(bash_id)` on a 60-second cadence, capped at 30 minutes (1800s). The legacy "single foreground `Bash` with 120000ms timeout" rule is retired — it forced workers into ad-hoc background dispatch that lost stdout and silently broke Phase 5 synthesis. The new rule applies in **every phase** (analysis runs typically complete in 1–2 polls, so there is no regression for short jobs). Recording responsibilities:
+  - Successful completion: return the wrapper's accumulated stdout from the final `BashOutput`. No log entry.
+  - Non-zero `exit_code` reported by `BashOutput`: record a `cli-failure` to the run-level error log with the real `exit_code` and observed `duration-ms`.
+  - 30-minute polling cap exceeded: call `KillShell(shell_id)` first, then record `cli-failure` with `--exit-code 124 --duration-ms 1800000 --message "<wrapper> exceeded 30m polling cap"`, then return the language-specific `*_CLI_TIMEOUT` sentinel.
+  - Token-usage matching is unaffected: the wrapper subagent stays alive throughout polling, so the wrapper's jsonl timestamp window continues to cover the underlying CLI rollout's full duration (see §"Token-usage accounting" below).
 - `contract-violation` events (C) are recorded by Lead via `okstra-error-log.py append-observed --error-type contract-violation ...` after inspecting worker outputs.
 - Lead's responsibility regarding the sidecar is to dump it to the run-level error log via `okstra-error-log.py append-from-worker` after each worker terminates; Lead does not write into the sidecar.
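
The polling contract added above (fixed cadence, hard cap, kill first, then report exit code 124) can be approximated in plain bash. This is only an analogue under stated assumptions: the real contract uses the `Bash(run_in_background: true)`, `BashOutput`, and `KillShell` tools, whereas `poll_with_cap` below is a hypothetical function that uses a background job, `kill -0`, and `wait`. The interval and cap are parameters so the sketch runs in seconds rather than 60s / 1800s.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical analogue of the polling contract (not part of okstra).
# Usage: poll_with_cap <interval-seconds> <cap-seconds> <command> [args...]
poll_with_cap() {
  local interval="$1" cap="$2"
  shift 2
  "$@" &                       # dispatch in the background
  local pid=$! waited=0
  while kill -0 "$pid" 2>/dev/null; do
    if (( waited >= cap )); then
      # Cap exceeded: terminate first, then report the timeout code.
      kill "$pid" 2>/dev/null || true
      wait "$pid" 2>/dev/null || true
      return 124
    fi
    sleep "$interval"          # one poll per interval
    waited=$(( waited + interval ))
  done
  wait "$pid"                  # propagate the command's real exit code
}

rc=0; poll_with_cap 1 5 sh -c 'exit 3' || rc=$?
printf 'failing command -> %s\n' "$rc"
rc=0; poll_with_cap 1 2 sleep 30 || rc=$?
printf 'capped command  -> %s\n' "$rc"
```

With the contract's values the call would look like `poll_with_cap 60 1800 <wrapper> ...`; a caller would then log a `cli-failure` for any non-zero result, using 124 to distinguish the cap from an ordinary CLI error.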

package/runtime/templates/reports/settings.template.json CHANGED

@@ -8,7 +8,9 @@
       "NotebookRead",
       "NotebookEdit",
       "Edit",
+      "Edit($HOME/.okstra/worktrees/**)",
       "Write",
+      "Write($HOME/.okstra/worktrees/**)",
       "WebFetch",
       "WebSearch",
       "TodoWrite",