okstra 0.48.0 → 0.50.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/docs/kr/architecture.md +8 -8
- package/docs/kr/cli.md +2 -2
- package/docs/project-structure-overview.md +3 -3
- package/docs/superpowers/plans/2026-06-05-compact-markdown-report-tables.md +323 -0
- package/docs/superpowers/plans/2026-06-05-wizard-batch-prompts.md +559 -0
- package/docs/superpowers/specs/2026-06-05-compact-markdown-report-tables-design.md +87 -0
- package/docs/superpowers/specs/2026-06-05-wizard-batch-prompts-design.md +121 -0
- package/docs/task-process/error-analysis.md +1 -1
- package/docs/task-process/final-verification.md +1 -1
- package/docs/task-process/release-handoff.md +1 -1
- package/docs/task-process/requirements-discovery.md +1 -1
- package/package.json +1 -1
- package/runtime/BUILD.json +2 -2
- package/runtime/agents/SKILL.md +3 -3
- package/runtime/agents/workers/claude-worker.md +1 -1
- package/runtime/agents/workers/codex-worker.md +1 -1
- package/runtime/agents/workers/gemini-worker.md +1 -1
- package/runtime/agents/workers/report-writer-worker.md +3 -3
- package/runtime/bin/lib/okstra/tmux-pane.sh +40 -0
- package/runtime/bin/okstra-codex-exec.sh +17 -21
- package/runtime/bin/okstra-gemini-exec.sh +12 -15
- package/runtime/bin/okstra-render-report-views.py +1 -1
- package/runtime/bin/okstra-trace-cleanup.sh +13 -1
- package/runtime/prompts/launch.template.md +1 -1
- package/runtime/prompts/profiles/_common-contract.md +15 -15
- package/runtime/prompts/profiles/_implementation-deliverable.md +1 -1
- package/runtime/prompts/profiles/_implementation-executor.md +1 -1
- package/runtime/prompts/profiles/_implementation-verifier.md +1 -1
- package/runtime/prompts/profiles/error-analysis.md +1 -1
- package/runtime/prompts/profiles/final-verification.md +2 -2
- package/runtime/prompts/profiles/implementation-planning.md +9 -9
- package/runtime/prompts/profiles/improvement-discovery.md +5 -5
- package/runtime/prompts/profiles/release-handoff.md +2 -2
- package/runtime/prompts/profiles/requirements-discovery.md +2 -2
- package/runtime/python/okstra_ctl/clarification_items.py +11 -11
- package/runtime/python/okstra_ctl/render.py +1 -1
- package/runtime/python/okstra_ctl/render_final_report.py +1 -1
- package/runtime/python/okstra_ctl/report_views.py +26 -39
- package/runtime/python/okstra_ctl/run.py +3 -3
- package/runtime/python/okstra_ctl/wizard.py +90 -3
- package/runtime/python/okstra_ctl/workflow.py +1 -1
- package/runtime/skills/okstra-brief/SKILL.md +1 -1
- package/runtime/skills/okstra-convergence/SKILL.md +8 -8
- package/runtime/skills/okstra-report-writer/SKILL.md +22 -22
- package/runtime/skills/okstra-run/SKILL.md +2 -0
- package/runtime/skills/okstra-team-contract/SKILL.md +1 -1
- package/runtime/templates/project-docs/task-index.template.md +1 -8
- package/runtime/templates/reports/final-report.template.md +194 -198
- package/runtime/templates/reports/i18n/en.json +16 -17
- package/runtime/templates/reports/i18n/ko.json +16 -17
- package/runtime/templates/reports/implementation-planning-input.template.md +1 -1
- package/runtime/templates/reports/release-handoff-input.template.md +1 -1
- package/runtime/templates/reports/schedule.template.md +3 -7
- package/runtime/templates/reports/user-response.template.md +1 -1
- package/runtime/templates/worker-prompt-preamble.md +1 -1
- package/runtime/validators/lib/fixtures.sh +2 -2
- package/runtime/validators/validate-implementation-plan-stages.py +9 -9
- package/runtime/validators/validate-report-views.py +10 -10
- package/runtime/validators/validate-run.py +36 -36
- package/runtime/validators/validate_improvement_report.py +8 -8
@@ -0,0 +1,121 @@
+# okstra wizard multi-tab batch prompt design
+
+- Date: 2026-06-05
+- Scope: `scripts/okstra_ctl/wizard.py`, `skills/okstra-run/SKILL.md`
+- Status: awaiting design approval
+
+## 1. Background / Problem
+
+The `okstra-run` skill renders the prompts emitted by the wizard state machine one at a time
+with `AskUserQuestion` and receives the next prompt only after the user answers, an **answer-and-wait round trip**
+structure ([wizard.py:2238](../../../scripts/okstra_ctl/wizard.py), [SKILL.md:96](../../../skills/okstra-run/SKILL.md)).
+
+In the `use_defaults=False` (customize) branch, picks with no dependency on one another are emitted one after
+another as individual prompts:
+
+- Models: `lead_model` → (impl) `executor_model` / (otherwise) per-roster `claude_model`·`codex_model`·`gemini_model` → `report_writer_model` (up to 5 prompts)
+- Options: `directive_pick` → `related_tasks_pick` → `clarification_pick` → (release-handoff) `pr_template_pick` (up to 4 prompts)
+
+Because each pick is a separate round trip, entering everything takes a long time. The goal is to **group independent
+questions of a similar kind into tabs and collect them with a single `AskUserQuestion` call**, reducing the number of round trips.
+
+## 2. Core principle
+
+**Existing `Step`s stay as they are; the batch concept is added only to the emission (presentation) layer.**
+
+- `answered` / `owns` / edit-rewind (`_reset_from`) / the `custom_ids` check in `_ready_for_confirm`
+  all continue to operate **per individual step id** ([wizard.py:2149](../../../scripts/okstra_ctl/wizard.py)).
+- A group is only a wrapper that "gathers pick steps with no mutual dependency onto one screen"; it does not
+  replace the steps themselves. Validation and rewind logic are therefore left intact.
+
+## 3. Changes
+
+### 3.1 New prompt kind `pick_group`
+
+Add a multi-question shape to `Prompt` serialization ([wizard.py:299](../../../scripts/okstra_ctl/wizard.py)).
+
+```json
+{
+  "step": "<group-id>",
+  "kind": "pick_group",
+  "questions": [
+    { "step": "lead_model", "label": "...", "options": [ {"value":"..","label":".."} ], "multi": false },
+    { "step": "claude_model", "label": "...", "options": [...], "multi": false }
+  ]
+}
+```
+
+- One `questions[]` entry = one `AskUserQuestion` tab.
+- Each entry reuses the `label` / `options` produced by the existing step's `build()` as-is (no duplicate definitions).
+- The existing `kind: "pick"` / `"text"` / `"done"` are unchanged.
+
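
A minimal sketch of how a `pick_group` payload could be assembled from already-built pick prompts. `PickPrompt` and `to_pick_group` are hypothetical names used only for illustration, and the labels are made up; only the JSON shape follows the spec above, and the real `Prompt` type in `wizard.py` may look different.

```python
import json
from dataclasses import dataclass, field


@dataclass
class PickPrompt:
    """Hypothetical stand-in for what a pick step's build() returns."""
    step: str
    label: str
    options: list = field(default_factory=list)  # [{"value": ..., "label": ...}]
    multi: bool = False


def to_pick_group(group_id, prompts):
    """Wrap already-built pick prompts into one pick_group payload.

    Labels and options are passed through from each member prompt
    unchanged, so nothing is defined a second time.
    """
    return {
        "step": group_id,
        "kind": "pick_group",
        "questions": [
            {"step": p.step, "label": p.label, "options": p.options, "multi": p.multi}
            for p in prompts
        ],
    }


# Illustrative members; the labels are assumptions, not okstra's wording.
lead = PickPrompt("lead_model", "Lead model", [{"value": "opus", "label": "Opus"}])
claude = PickPrompt("claude_model", "Claude worker model",
                    [{"value": "default", "label": "Phase default"}])

payload = to_pick_group("models", [lead, claude])
print(json.dumps(payload, indent=2))
```

Because the wrapper only references the member prompts, the individual steps remain the single source of their labels and options.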
+### 3.2 Group definitions
+
+Only the **pick steps** of the customize branch are grouped. An explicit, ordered definition lives in wizard.py.
+
+| Group id | Member steps (only when applicable) |
+|---|---|
+| `models` | `lead_model`, (impl) `executor_model` / (otherwise) `claude_model`·`codex_model`·`gemini_model`, `report_writer_model` |
+| `options` | `directive_pick`, `related_tasks_pick`, `clarification_pick`, (release-handoff) `pr_template_pick` |
+
+**Kept individual (must not be grouped)**:
+
+- `*_TEXT` follow-ups (`directive`, `related_tasks`, `clarification`, `pr_template`): conditional text inputs that
+  appear only when "direct input" is chosen. `AskUserQuestion` does not support text input
+  ([SKILL.md:50](../../../skills/okstra-run/SKILL.md)), so they remain individual text prompts.
+- `workers_override`: it determines which model tabs (claude/codex/gemini) apply, so it **must precede** the `models`
+  group. Kept individual.
+- `pr_template_scope`: it applies only after `pr_template_path` has been settled, so it is kept individual.
+
+### 3.3 Engine `next_prompt` changes
+
+The next-prompt decision logic at [wizard.py:2238](../../../scripts/okstra_ctl/wizard.py):
+
+1. As before, find the first "applicable + unanswered" step.
+2. If that step is not a group member, return the existing `Prompt` unchanged (behaviour unchanged).
+3. If it is a group member, gather the **applicable + unanswered pick members** of the same group in definition order, **up to 4**,
+   and return them as a `pick_group`.
+   - This is because `AskUserQuestion` is limited to 4 questions.
+   - With the full non-implementation roster (lead+claude+codex+gemini+report = 5), only the first 4 go out in one batch,
+     and after they are answered the 5th is split off automatically into the next batch. **The skill needs no chunking logic**; worst case is 2 screens.
+
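
The three steps above can be sketched as a pure function. This is an illustration under stated assumptions, not the engine's code: `GROUPS`, `next_batch`, `ORDER` and `APPLICABLE` are hypothetical names, and the real `next_prompt` works on wizard state rather than plain sets.

```python
MAX_QUESTIONS = 4  # AskUserQuestion question limit, as stated in the spec

# Hypothetical ordered group table; member names follow section 3.2.
GROUPS = {
    "models": ["lead_model", "executor_model", "claude_model",
               "codex_model", "gemini_model", "report_writer_model"],
    "options": ["directive_pick", "related_tasks_pick",
                "clarification_pick", "pr_template_pick"],
}


def next_batch(ordered_steps, applicable, answered):
    """Return ("single", step), ("group", group_id, members) or ("done",)."""
    pending = [s for s in ordered_steps if s in applicable and s not in answered]
    if not pending:
        return ("done",)
    first = pending[0]
    for group_id, members in GROUPS.items():
        if first in members:
            # Collect applicable, unanswered members in definition order, capped at 4.
            batch = [m for m in members if m in applicable and m not in answered]
            return ("group", group_id, batch[:MAX_QUESTIONS])
    return ("single", first)  # non-member steps keep their individual prompt


# Full non-implementation roster: five model picks after workers_override.
ORDER = ["workers_override", "lead_model", "claude_model",
         "codex_model", "gemini_model", "report_writer_model"]
APPLICABLE = set(ORDER)

answered = set()
while True:
    result = next_batch(ORDER, APPLICABLE, answered)
    print(result)
    if result[0] == "done":
        break
    # Pretend the user answered everything that was just shown.
    answered.update([result[1]] if result[0] == "single" else result[2])
```

Run against this roster, the loop shows `workers_override` alone, then a batch of four model picks, then `report_writer_model` on its own, which matches the two-screen worst case described above.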
+### 3.4 Submission path changes
+
+The advance at [wizard.py:2233](../../../scripts/okstra_ctl/wizard.py) and the CLI `--answer` parsing ([wizard.py:2363](../../../scripts/okstra_ctl/wizard.py)):
+
+- If the currently active prompt is a `pick_group`, `--answer` is accepted as a **JSON object string**.
+  ```
+  okstra wizard step --state-file <path> --answer '{"lead_model":"opus","claude_model":"default","report_writer_model":""}'
+  ```
+- The engine routes each key to the `submit()` of the corresponding member step and adds each member to
+  `answered` individually.
+- A missing key or an empty value ("" or "default") is treated as the "phase default" under the existing `_validate_model` rules
+  ([wizard.py:418](../../../scripts/okstra_ctl/wizard.py)). If even one member
+  fails validation, the same group is re-prompted with `ok:false` (no partial application: every member must pass validation before
+  anything is marked `answered`).
+
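
The all-or-nothing rule can be illustrated with a validate-then-commit sketch. Everything here is an assumption made for the example: `validate_model` is a simplified stand-in for `_validate_model`, the `ALLOWED_MODELS` whitelist is invented, and `submit_group` is a hypothetical helper rather than the engine's API.

```python
import json

ALLOWED_MODELS = {"opus", "sonnet"}  # illustrative whitelist, not okstra's real list


def validate_model(value):
    """Simplified stand-in: "" and "default" both mean the phase default."""
    if not isinstance(value, str):
        raise ValueError("model must be a string")
    if value in ("", "default"):
        return None  # None stands for "use the phase default"
    if value not in ALLOWED_MODELS:
        raise ValueError(f"unknown model: {value!r}")
    return value


def submit_group(raw_answer, members, answered, values):
    """Apply a pick_group answer atomically.

    Every member is validated into a staging dict first; wizard state
    is touched only once all members have passed.
    """
    try:
        data = json.loads(raw_answer)
    except json.JSONDecodeError as exc:
        return {"ok": False, "error": f"invalid JSON: {exc}"}
    if not isinstance(data, dict):
        return {"ok": False, "error": "answer must be a JSON object"}

    staged = {}
    for step in members:
        try:
            staged[step] = validate_model(data.get(step, ""))  # missing key -> default
        except ValueError as exc:
            return {"ok": False, "error": f"{step}: {exc}"}  # nothing committed

    values.update(staged)
    answered.update(staged.keys())  # each member is marked individually
    return {"ok": True}


members = ["lead_model", "claude_model", "report_writer_model"]
answered, values = set(), {}
print(submit_group('{"lead_model":"opus","claude_model":"bogus"}', members, answered, values))
print(submit_group('{"lead_model":"opus","claude_model":"default"}', members, answered, values))
```

Staging the validated values before touching `answered` is what keeps a single bad member from leaving the group half-applied.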
+### 3.5 SKILL.md render rule additions
+
+[SKILL.md:41-50](../../../skills/okstra-run/SKILL.md), [SKILL.md:96](../../../skills/okstra-run/SKILL.md):
+
+- `kind: "pick_group"` → **one** `AskUserQuestion` call whose tabs (questions) are `questions[]`.
+  Each tab takes `label` + `options`; `multiSelect` comes from `questions[].multi`.
+- Bundle the user's per-tab selection (`options[].value`) under the `questions[].step` keys into a JSON string and
+  submit it with a single `--answer '<json>'`.
+- The literal-token permission rule still applies: pass the `--answer` value as a literal JSON string, with no shell variables or `$(...)`.
+- On validation failure (`ok:false`), re-prompt the same group.
+
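
On the caller side, bundling the per-tab selections into one literal `--answer` value might look like the sketch below. `build_answer_argv` is a hypothetical helper and the state-file path is a placeholder; the sketch also assumes single-select tabs, since the spec does not say how a `multi: true` answer would be encoded.

```python
import json
import shlex


def build_answer_argv(state_file, prompt, selections):
    """Build the argv for one `okstra wizard step` call from per-tab selections."""
    if prompt.get("kind") != "pick_group":
        raise ValueError("expected a pick_group prompt")
    answer = {}
    for question in prompt["questions"]:
        step = question["step"]
        value = selections[step]
        offered = [opt["value"] for opt in question["options"]]
        if value not in offered:
            raise ValueError(f"{step}: {value!r} is not one of the offered options")
        answer[step] = value  # keyed by questions[].step, in tab order
    payload = json.dumps(answer, ensure_ascii=False, separators=(",", ":"))
    return ["okstra", "wizard", "step", "--state-file", state_file, "--answer", payload]


prompt = {
    "step": "models",
    "kind": "pick_group",
    "questions": [
        {"step": "lead_model", "label": "...", "multi": False,
         "options": [{"value": "opus", "label": ".."}]},
        {"step": "claude_model", "label": "...", "multi": False,
         "options": [{"value": "default", "label": ".."}]},
    ],
}

argv = build_answer_argv("/tmp/wizard-state.json", prompt,
                         {"lead_model": "opus", "claude_model": "default"})
command_line = shlex.join(argv)  # literal, single-quoted form for display
print(command_line)
```

The JSON is serialized once and passed as a single argument, so the resulting command line contains no shell variables or command substitutions.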
+## 4. Impact / Non-impact
+
+- **Behaviour change**: the model and option picks of the customize branch are grouped into multi-tab prompts, reducing round trips
+  (3 to 5 model picks → 1 to 2 round trips, 3 to 4 option picks → 1 round trip).
+- **Unchanged**: the `use_defaults=True` path, the identity stage (task pick/type/base-ref/plan/executor),
+  confirm/branch_confirm/edit-target, and the render-bundle argument mapping ([wizard.py:2249](../../../scripts/okstra_ctl/wizard.py)).
+- Only items where **"direct input"** was chosen keep their text follow-up as an individual prompt (unavoidable).
+
+## 5. Tests
+
+- The existing wizard unit tests (`tests/`) must keep passing against the individual step contract (the group changes only the emission layer).
+- New: add tests covering `pick_group` emission (more than 4 members → split into 2 batches), JSON `--answer` routing (re-prompt on
+  partial validation failure), and `workers_override` preceding → roster reflected in the `models` group.
+- Regression check with `node bin/okstra --version` / `bash validators/validate-workflow.sh`.
@@ -87,7 +87,7 @@ The final report must contain the following.
 - evidence-backed cause analysis
 - uncertainty boundary
 - practical next diagnostic steps
-- If there is blocking uncertainty, `##
+- If there is blocking uncertainty, `## 1. Clarification Items`, usually `Blocks=next-phase`

 What is prohibited: source edits, refactors, fix attempts, implementation design artifacts, and running build/migration/deploy. Handing an ambiguity to the user as a question when code or logs could answer it is also a defect under the profile.

@@ -108,7 +108,7 @@ flowchart TD
 Blockers --> Followup
 ```

-`##
+`## 7. Final Verdict` must contain exactly one `Verdict Token` field, and its value is one of the following three.

 - `accepted`: ready to be handed over to release-handoff
 - `conditional-accept`: every condition must be stated explicitly, and a condition that is a gate blocks the next phase
@@ -83,7 +83,7 @@ flowchart TD
 Before asking the user whether to push or open a PR, the lead confirms the following.

 - The task brief points to `## Source Verification Report`.
-- That report's `##
+- That report's `## 7. Final Verdict` contains exactly `Verdict Token = accepted`.
 - The working tree is clean.
 - The current branch is not a base branch such as `main`, `master`, `prod`, `preprod`, `staging`, or `dev`.
 - The `<base>..HEAD` commit range is not empty.
@@ -100,7 +100,7 @@ The final report places particular emphasis on the following.
 - missing input and uncertainty boundary
 - next phase and safe resume guidance
 - canonical term resolution for `terminology:*` brief items
-- If there is blocking input, `##
+- If there is blocking input, `Blocks=next-phase` in the `## 1. Clarification Items` unified table

 Non-goals are source edits, plan authoring, build, and deployment.

package/package.json
CHANGED
package/runtime/BUILD.json
CHANGED
package/runtime/agents/SKILL.md
CHANGED
@@ -310,12 +310,12 @@ Distinct from Phase 5.5 finding convergence:

 Lead's responsibilities in this sub-step (in order):

-1. Extract `P-*` plan items from the draft report's `##
+1. Extract `P-*` plan items from the draft report's `## 5.5 Implementation Plan Deliverables` per the prefix → source-section mapping in the convergence skill.
 2. Dispatch a single plan-body reverify round to every analyser worker in the roster (`claude`, `codex`, and `gemini` when opted in). `Report writer worker` is NOT a participant in this round.
 3. Aggregate verdicts and resolve the gate result to one of `passed` / `passed-with-dissent` / `blocked-by-disagreement` / `aborted-non-result`.
 4. Write `runs/<task-type>/state/plan-body-verification.json` (schema in the convergence skill).
-5. Populate `###
-6. For every `majority-disagree` plan item, append a row to `##
+5. Populate `### 5.5.9 Plan Body Verification` in the final-report file (template at `templates/reports/final-report.template.md` §5.5.9 — Round count, Gate result, Verdict table, Dissent log).
+6. For every `majority-disagree` plan item, append a row to `## 1. Clarification Items` with `Blocks=approval` and the 1:1 ID match in the verdict table's `Classification` column (`majority-disagree → C-<N>`). Do NOT create a parallel `Open Questions` block — see `prompts/profiles/implementation-planning.md` self-review step 6 for the orphan-on-either-side contract.
 7. Conditionally render the top-of-report `- [ ] Approved` marker line: present iff gate ∈ {passed, passed-with-dissent}, absent iff gate ∈ {blocked-by-disagreement, aborted-non-result}. `validators/validate-run.py` `validate_phase_boundary` enforces this correspondence. Manually flipping a blocked gate to passing in order to render the marker is a contract violation.

 If `convergence.planBodyVerification.enabled == false` (set by `--no-plan-verification` or by `okstra config set plan-verification off`), the entire sub-step is skipped and the top-of-report Approval marker is rendered unconditionally (legacy behaviour). This opt-out is intended for fast iteration only and is not recommended for handoff-ready plans.
@@ -78,7 +78,7 @@ When returning results, start the file with a YAML frontmatter block, then organ

 Include file paths and line numbers when discussing code evidence.

-**Item IDs (mandatory).** Every row in sections 1–5 (and any optional section 6) MUST carry a worker-internal item ID unique within this file. Use the leading column for table-form items (`F-001`, `M-001`, `S-001`, `U-001`, `R-001` per section) or a `[<ID>]` prefix for bullet/numbered items. The ID shape is your choice but it MUST appear — the lead's §
+**Item IDs (mandatory).** Every row in sections 1–5 (and any optional section 6) MUST carry a worker-internal item ID unique within this file. Use the leading column for table-form items (`F-001`, `M-001`, `S-001`, `U-001`, `R-001` per section) or a `[<ID>]` prefix for bullet/numbered items. The ID shape is your choice but it MUST appear — the lead's §6.1 / §6.2 / §2.1 synthesis preserves these IDs in its `Source items (worker:item)` column to keep cross-worker traceability intact. See `prompts/profiles/_common-contract.md` "Cross-worker traceability" SSOT.

 **Ticket tagging:** For runs whose task type is `requirements-discovery`, `error-analysis`, `implementation-planning`, or `implementation`, every item in sections 1–5 MUST carry a ticket identifier. Use the `Ticket ID` column in table-form items and the `[TICKETID: <id>]` prefix in bullet/numbered items. Fill priority: `Issue / Ticket` from the input → `Task ID` (no prefix, e.g. `8852`) → `unknown`. Multiple tickets are comma-separated. Full rules live in the `okstra-team-contract` skill's Ticket Tagging section.

@@ -161,7 +161,7 @@ When returning results, start the file with a YAML frontmatter block, then organ

 Include file paths and line numbers when discussing code evidence.

-**Item IDs (mandatory).** Every row in sections 1–5 (and any optional section 6) MUST carry a worker-internal item ID unique within this file. Codex tends to use hierarchical numbering (`1.1`, `1.2`, `1.3`, ...); that shape is fine — keep what's natural. What matters is that each item is addressable. The lead's §
+**Item IDs (mandatory).** Every row in sections 1–5 (and any optional section 6) MUST carry a worker-internal item ID unique within this file. Codex tends to use hierarchical numbering (`1.1`, `1.2`, `1.3`, ...); that shape is fine — keep what's natural. What matters is that each item is addressable. The lead's §6.1 / §6.2 / §2.1 synthesis preserves these IDs as `codex:<your-id>` entries in its `Source items (worker:item)` column. See `prompts/profiles/_common-contract.md` "Cross-worker traceability" SSOT.

 **Ticket tagging:** For runs whose task type is `requirements-discovery`, `error-analysis`, `implementation-planning`, or `implementation`, every item in sections 1–5 MUST carry a ticket identifier. Use the `Ticket ID` column in table-form items and the `[TICKETID: <id>]` prefix in bullet/numbered items. Fill priority: `Issue / Ticket` from the input → `Task ID` (no prefix, e.g. `8852`) → `unknown`. Multiple tickets are comma-separated. Full rules live in the `okstra-team-contract` skill's Ticket Tagging section.

@@ -161,7 +161,7 @@ When returning results, start the file with a YAML frontmatter block, then organ

 Include file paths and line numbers when discussing code evidence.

-**Item IDs (mandatory).** Every row in sections 1–5 (and any optional section 6) MUST carry a worker-internal item ID unique within this file. Gemini may use `F-1`, `F-2`, ... or numbered hierarchical IDs — either is fine. What matters is that each item is addressable. The lead's §
+**Item IDs (mandatory).** Every row in sections 1–5 (and any optional section 6) MUST carry a worker-internal item ID unique within this file. Gemini may use `F-1`, `F-2`, ... or numbered hierarchical IDs — either is fine. What matters is that each item is addressable. The lead's §6.1 / §6.2 / §2.1 synthesis preserves these IDs as `gemini:<your-id>` entries in its `Source items (worker:item)` column. See `prompts/profiles/_common-contract.md` "Cross-worker traceability" SSOT.

 **Ticket tagging:** For runs whose task type is `requirements-discovery`, `error-analysis`, `implementation-planning`, or `implementation`, every item in sections 1–5 MUST carry a ticket identifier. Use the `Ticket ID` column in table-form items and the `[TICKETID: <id>]` prefix in bullet/numbered items. Fill priority: `Issue / Ticket` from the input → `Task ID` (no prefix, e.g. `8852`) → `unknown`. Multiple tickets are comma-separated. Full rules live in the `okstra-team-contract` skill's Ticket Tagging section.

@@ -71,9 +71,9 @@ For the report writer specifically, the `## Inputs` list always includes:
 - `<instruction-set>/final-report-template.md` — the **phase-stripped** Jinja2 template the renderer uses (only this run's §4.x deliverable block remains). Read it to understand which data.json fields appear where in the rendered markdown; do NOT edit it, and do NOT pull the full `templates/reports/final-report.template.md` source.
 - `templates/reports/i18n/en.json` and `templates/reports/i18n/ko.json`.
 - Every analysis worker's result file under `worker-results/`.
-- `state/convergence-<task-type>-<seq>.json` (if present). When present, reproduce its `roundHistory[]`, `round2SkippedReason`, and `finalClassificationCounts` verbatim into the final report's Section
+- `state/convergence-<task-type>-<seq>.json` (if present). When present, reproduce its `roundHistory[]`, `round2SkippedReason`, and `finalClassificationCounts` verbatim into the final report's Section 6 Round History sub-table — do not recompute from worker results.

-For the carry-in `clarification-response.md` (if present), walk every row of `##
+For the carry-in `clarification-response.md` (if present), walk every row of `## 1. Clarification Items` including rows whose `User input` cell is blank — a blank cell with `Status=open` is a signal you must surface in the conditional `## 0. Clarification Response Carried In From Previous Run` section (the template's `RENDER_IF` guard activates it when the carry-in path is non-empty). When no carry-in path was provided, OMIT the `## 0.` heading entirely — do NOT write an empty-state stub.

 Write a Reading Confirmation block to your **audit sidecar** at `runs/<task-type>/worker-results/report-writer-worker-audit-<task-type>-<seq>.md`. The main final-report and the main worker-results file MUST NOT contain a `## 0. Reading Confirmation` heading. If you cannot truthfully confirm a file end-to-end, record a `tool-failure` in the errors sidecar instead of fabricating the report.

@@ -100,7 +100,7 @@ Rules (the schema enforces most of these — they are listed here so you know *w
 - If evidence is missing, write `"I don't know"` in the relevant statement field rather than fabricating confidence.
 - Cite file paths and line numbers in every `evidence.primary[].source` / `consensus[].evidence` cell.
 - Preserve every analysis worker's ticket tagging — every row's `ticketId` field carries the ticket key or the task-fallback. For single-ticket runs, set `ticketCoverage` to `{"singleTicket": "<ticket>"}`. For runs that do not require ticket tagging (`release-handoff`, `final-verification`), set `ticketCoverage` to `{"omit": true}`.
-- When the `Task Type` is `improvement-discovery`, populate `##
+- When the `Task Type` is `improvement-discovery`, populate `## 5.9 Improvement Candidates` with the 10-column schema enforced by `validators/validate-improvement-report.py`. Source the row IDs (`I-NNN`), lens whitelist, and Source workers patterns from `scripts/okstra_ctl/improvement_lenses.py` — do NOT introduce new lens names or worker prefixes.

 Write the data.json with your `Write` tool against the absolute `Result Path`. Then invoke the renderer (`Bash`): `python3 scripts/okstra-render-final-report.py <data.json path>`. Confirm both files exist and respond with a short status line: `data.json written to <abs path>; markdown rendered to <abs path>. Sections populated: <count>.`

@@ -0,0 +1,40 @@
+#!/usr/bin/env bash
+# Resolve the tmux pane that the CURRENT process actually runs in.
+#
+# Why this exists: Claude Code's Bash tool strips $TMUX and $TMUX_PANE, so a
+# bare `tmux display-message -p '#{pane_id}'` does NOT return the caller's pane
+# — it returns the pane of the most-recently-active tmux *client*, which (when
+# the user has several attached sessions) is frequently a DIFFERENT session
+# than the one the okstra run lives in. Earlier trace-pane fixes all trusted
+# `display-message` and therefore mis-placed (or dropped) the tail pane.
+#
+# This resolver instead walks the process's own ancestor PIDs and matches them
+# against the tmux server's pane_pids. That is deterministic and correct
+# regardless of $TMUX/$TMUX_PANE or which client is active: when the process is
+# a descendant of a tmux pane's shell it finds exactly that pane; when it is not
+# inside any tmux pane (e.g. Claude launched from the macOS GUI app) no ancestor
+# matches and the function prints nothing.
+#
+# Usage: pane="$(okstra_resolve_caller_pane)" # empty => not in a tmux pane
+# Optional arg: a starting PID (defaults to $$) — used by the regression test.
+# bash 3.2 safe (no associative arrays).
+okstra_resolve_caller_pane() {
+  command -v tmux >/dev/null 2>&1 || return 0
+  local panes
+  panes="$(tmux list-panes -a -F '#{pane_pid} #{pane_id}' 2>/dev/null)" || return 0
+  [ -n "$panes" ] || return 0
+
+  local pid="${1:-$$}"
+  local depth=0
+  local hit
+  while [ -n "$pid" ] && [ "$pid" != "0" ] && [ "$depth" -lt 16 ]; do
+    hit="$(printf '%s\n' "$panes" | awk -v p="$pid" '$1==p {print $2; exit}')"
+    if [ -n "$hit" ]; then
+      printf '%s\n' "$hit"
+      return 0
+    fi
+    pid="$(ps -o ppid= -p "$pid" 2>/dev/null | tr -d ' ')"
+    depth=$((depth + 1))
+  done
+  return 0
+}
@@ -183,36 +183,32 @@ status_path="${prompt_path%.md}.status.json"
 [[ "$status_path" == "$prompt_path" ]] && status_path="${prompt_path}.status.json"
 started_ts=$(date +%s)
 script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd -P)"
+# Trace-pane caller resolution helper (okstra_resolve_caller_pane). The lib dir
+# is a bin-sibling in both repo (scripts/lib/...) and installed
+# (~/.okstra/bin/lib/...) layouts; degrade silently if absent.
+[ -r "$script_dir/lib/okstra/tmux-pane.sh" ] && . "$script_dir/lib/okstra/tmux-pane.sh"
 python3 "$script_dir/okstra-wrapper-status.py" \
   init "$status_path" "$(basename "$0")" "$role" "$$" "$started_ts" "$log_path" \
   >>"$log_path" 2>&1 || true

 # Derive the okstra run dir from the prompt path. paths.py is the SSOT:
 # dispatched prompts live at `<RUN_DIR>/prompts/<cli>-worker-prompt<NNN>.md`,
-# so the run dir is two levels up. Used to
-#
-# can find exactly this run's panes without any tmux env var. Empty if the
+# so the run dir is two levels up. Used to tag the trace pane so cleanup can
+# find exactly this run's panes without any tmux env var. Empty if the
 # derivation fails — every dependent step below then degrades to a no-op.
 run_dir="$(cd "$(dirname "$prompt_path")/.." 2>/dev/null && pwd -P || true)"
-lead_pane_file="${run_dir:+$run_dir/state/lead-pane.id}"

-# Resolve the pane
-#
-#
-#
-#
-#
-#
-#
-caller_pane="
-if
-
-  if [[ -n "$cand" ]] && tmux display-message -p -t "$cand" '#{pane_id}' >/dev/null 2>&1; then
-    caller_pane="$cand"
-  fi
-fi
-if [[ -z "$caller_pane" ]]; then
-  caller_pane=$(tmux display-message -p '#{pane_id}' 2>/dev/null || true)
+# Resolve the pane THIS wrapper actually runs in by walking our ancestor PIDs
+# and matching tmux pane_pids (see lib/okstra/tmux-pane.sh). Reliable
+# regardless of $TMUX/$TMUX_PANE (stripped by Claude Code's Bash tool) and of
+# which tmux client is currently active — a bare `tmux display-message` would
+# instead return the most-recently-active client's pane, frequently a DIFFERENT
+# session than the okstra run, which is why earlier approaches mis-placed or
+# dropped the trace pane. Empty = not inside a tmux pane (e.g. Claude launched
+# from the GUI app) → the trace split below is skipped.
+caller_pane=""
+if type okstra_resolve_caller_pane >/dev/null 2>&1; then
+  caller_pane="$(okstra_resolve_caller_pane)"
 fi

 # Pane titles: worker (caller) pane gets `codex-<role>-<pid>`; the sibling
@@ -132,28 +132,25 @@ status_path="${prompt_path%.md}.status.json"
 [[ "$status_path" == "$prompt_path" ]] && status_path="${prompt_path}.status.json"
 started_ts=$(date +%s)
 script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd -P)"
+# Trace-pane caller resolution helper (okstra_resolve_caller_pane). The lib dir
+# is a bin-sibling in both repo (scripts/lib/...) and installed
+# (~/.okstra/bin/lib/...) layouts; degrade silently if absent.
+[ -r "$script_dir/lib/okstra/tmux-pane.sh" ] && . "$script_dir/lib/okstra/tmux-pane.sh"
 python3 "$script_dir/okstra-wrapper-status.py" \
   init "$status_path" "$(basename "$0")" "$role" "$$" "$started_ts" "$log_path" \
   >>"$log_path" 2>&1 || true

 # Resolve the run dir and the trace-split anchor pane. See
-# `okstra-codex-exec.sh` for the full rationale —
-# `<RUN_DIR>` from the prompt path (paths.py SSOT) to
-#
-#
-#
+# `okstra-codex-exec.sh` / `lib/okstra/tmux-pane.sh` for the full rationale —
+# kept in lock-step: derive `<RUN_DIR>` from the prompt path (paths.py SSOT) to
+# tag the trace pane, and resolve the caller pane by walking our ancestor PIDs
+# against tmux pane_pids (reliable even though `$TMUX`/`$TMUX_PANE` are stripped
+# and the wrapper runs backgrounded). Empty = not inside a tmux pane → skip.
 run_dir="$(cd "$(dirname "$prompt_path")/.." 2>/dev/null && pwd -P || true)"
-lead_pane_file="${run_dir:+$run_dir/state/lead-pane.id}"

-caller_pane="
-if
-
-  if [[ -n "$cand" ]] && tmux display-message -p -t "$cand" '#{pane_id}' >/dev/null 2>&1; then
-    caller_pane="$cand"
-  fi
-fi
-if [[ -z "$caller_pane" ]]; then
-  caller_pane=$(tmux display-message -p '#{pane_id}' 2>/dev/null || true)
+caller_pane=""
+if type okstra_resolve_caller_pane >/dev/null 2>&1; then
+  caller_pane="$(okstra_resolve_caller_pane)"
 fi

 # Pane titles: worker (caller) pane gets `gemini-<role>-<pid>`; the sibling
@@ -129,7 +129,7 @@ def main(argv: list[str] | None = None) -> int:
     meta = RunMeta(task_key=task_key, task_type=task_type, seq=seq, source_report=source_report)
     html_path = render_html_view(report_path, run_meta=meta, css=css, js=js)
     if html_path is None:
-        print("html: skipped (no §
+        print("html: skipped (no §1 clarification rows — html view carries no interactive forms for this report)")
     else:
         print(f"html: {html_path}")
     return 0
@@ -37,6 +37,14 @@

 set -u

+# Trace-pane caller resolution helper (okstra_resolve_caller_pane) — see
+# lib/okstra/tmux-pane.sh. Used as the lead-pane fallback below so a missing /
+# stale lead-pane.id resolves to the pane THIS process actually runs in (via
+# ancestor-PID ↔ tmux pane_pid matching), never a foreign active-client pane.
+# Bin-sibling path in repo + installed layouts; degrade silently if absent.
+_clean_script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd -P)"
+[ -r "$_clean_script_dir/lib/okstra/tmux-pane.sh" ] && . "$_clean_script_dir/lib/okstra/tmux-pane.sh"
+
 MODE="kill" # kill | list
 REAP=0
 run_dir=""
@@ -88,7 +96,11 @@ if [[ "$REAP" -eq 0 ]]; then
|
|
|
88
96
|
[[ -r "$lead_pane_file" ]] && lead_pane="$(head -n1 "$lead_pane_file" 2>/dev/null || true)"
|
|
89
97
|
fi
|
|
90
98
|
if [[ -z "$lead_pane" ]] || ! tmux display-message -p -t "$lead_pane" '#{pane_id}' >/dev/null 2>&1; then
|
|
91
|
-
|
|
99
|
+
if type okstra_resolve_caller_pane >/dev/null 2>&1; then
|
|
100
|
+
lead_pane="$(okstra_resolve_caller_pane 2>/dev/null || true)"
|
|
101
|
+
else
|
|
102
|
+
lead_pane="$(tmux display-message -p '#{pane_id}' 2>/dev/null || true)"
|
|
103
|
+
fi
|
|
92
104
|
fi
|
|
93
105
|
|
|
94
106
|
# Does a trace pane's tag belong to the set we are closing?
|
|
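Note on the hunk above: the added comments describe resolving the lead pane by matching this process's ancestor PIDs against tmux `pane_pid` values, instead of trusting the active client's pane. The body of `okstra_resolve_caller_pane` is not shown in this section, so the following is only a minimal Python sketch of the matching idea, not the shell implementation. The names `resolve_caller_pane`, `parent_of`, and `pane_by_pid` are hypothetical; a real script would fill these maps from the process table and from tmux's pane listing.

```python
from typing import Mapping, Optional


def resolve_caller_pane(
    pid: int,
    parent_of: Mapping[int, int],
    pane_by_pid: Mapping[int, str],
    max_depth: int = 64,
) -> Optional[str]:
    """Walk pid -> parent -> grandparent and return the first tmux pane whose
    pane_pid equals one of those PIDs, or None when no ancestor owns a pane."""
    current = pid
    for _ in range(max_depth):          # bound the walk in case of a cycle
        if current in pane_by_pid:
            return pane_by_pid[current]
        parent = parent_of.get(current)
        if parent is None or parent == current or parent <= 0:
            return None                 # reached the top without a match
        current = parent
    return None


# Hypothetical process tree: 4242 (script) -> 4100 (shell) -> 900 (pane root).
parent_of = {4242: 4100, 4100: 900, 900: 1}
pane_by_pid = {900: "%3", 1200: "%7"}   # pane_pid -> pane_id

print(resolve_caller_pane(4242, parent_of, pane_by_pid))  # %3
print(resolve_caller_pane(5000, {5000: 1}, pane_by_pid))  # None
```

The second call models a process launched outside tmux: no ancestor matches a pane, so the result is empty rather than a foreign pane, which is the behavior the comments ask for.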
@@ -91,4 +91,4 @@ Emit one `PROGRESS: <phase-id> <verb-phrase>` line as plain user-facing text at
|
|
|
91
91
|
|
|
92
92
|
- Source path: `{{CLARIFICATION_RESPONSE_RELATIVE_PATH}}`
|
|
93
93
|
- If the source path above is empty, no prior clarification response was attached to this run.
|
|
94
|
-
- If the source path is set, a copy is staged at `{{INSTRUCTION_SET_RELATIVE_PATH}}/clarification-response.md`. Read it before running workers; reconcile each `C-*` row in section
|
|
94
|
+
- If the source path is set, a copy is staged at `{{INSTRUCTION_SET_RELATIVE_PATH}}/clarification-response.md`. Read it before running workers; reconcile each `C-*` row in section 1 (`## 1. Clarification Items`) of the prior report against new evidence and record the outcome in the conditional `## 0. Clarification Response Carried In From Previous Run` section of this run's final report (render that heading only when carry-in is non-empty — the validator fails empty Section 0 stubs).
|
|
@@ -24,14 +24,14 @@ profile document.
|
|
|
24
24
|
- Create, modify, or delete only inside `<PROJECT_ROOT>/.okstra/**` unless the brief verbatim requests a specific non-okstra edit. The phase performing that edit must quote the user instruction in its report. Implementation source edits also require the approved implementation plan.
|
|
25
25
|
- Authority & permissions assumption (applies to every okstra task-type):
|
|
26
26
|
- **Assume the user (and their team) holds full authority and every permission required for the anticipated, in-flight, or follow-up work.** Treat external approvals, third-party access grants, role/IAM permissions, organisational sign-off, legal/compliance review, vendor coordination, and "verify access exists" steps as already satisfied unless the task brief explicitly states otherwise.
|
|
27
|
-
- Do NOT add such items to routing decisions, missing-materials lists, clarification questions, option trade-offs, dependency/migration risk, validation checklists, rollout plans, acceptance blockers, residual risks, release recommendations, the `##
|
|
27
|
+
- Do NOT add such items to routing decisions, missing-materials lists, clarification questions, option trade-offs, dependency/migration risk, validation checklists, rollout plans, acceptance blockers, residual risks, release recommendations, the `## 1. Clarification Items` table, or any day/effort estimate. They are not legitimate sources of schedule extension.
|
|
28
28
|
- Internal okstra phase handoffs (e.g. the `User Approval Request` block in `implementation-planning`) are unaffected — those are the user themselves approving and proceed without external coordination.
|
|
29
29
|
- This rule does NOT relax any phase-specific Forbidden actions list; safety rules in the per-profile document remain in force regardless of the user's authority.
|
|
30
30
|
- Anti-escalation rule (shared):
|
|
31
31
|
- treating "다음 단계 진행해" or equivalent user phrases as authorisation to start a *different* lifecycle phase is forbidden. The next phase begins only in a separate okstra run launched with the new `--task-type`. Per-profile documents may further restrict this within their own scope.
|
|
32
32
|
- Run-start pane recording (shared — runs ONCE at run start, before the FIRST worker dispatch):
|
|
33
|
-
- The wrappers anchor
|
|
34
|
-
- The lead MUST run once, at run start: `mkdir -p "<RUN_DIR>/state" &&
|
|
33
|
+
- The codex/gemini wrappers now self-anchor their trace pane by walking their own ancestor PIDs against tmux `pane_pid`s (see `lib/okstra/tmux-pane.sh`), so they no longer depend on this file. The lead still records its own pane id here for the cleanup steps below (which-pane-to-never-kill) and as the "am I in tmux" gate. A bare `tmux display-message -p '#{pane_id}'` is NOT reliable for this — Claude Code's Bash tool strips `$TMUX`/`$TMUX_PANE`, so that command returns the most-recently-active *client's* pane (often a different session, or a foreign pane when the lead is launched outside tmux entirely). The lead therefore records via the same ancestry resolver.
|
|
34
|
+
- The lead MUST run once, at run start: `mkdir -p "<RUN_DIR>/state" && { . "$HOME/.okstra/bin/lib/okstra/tmux-pane.sh" 2>/dev/null && okstra_resolve_caller_pane; } > "<RUN_DIR>/state/lead-pane.id" 2>/dev/null || true` (substitute the run's absolute `RUN_DIR`). When the lead is not inside a tmux pane (e.g. Claude launched from the GUI app) no ancestor matches a pane, the file is empty, and every pane step below silently no-ops — that empty/absent file is the single signal that the lead is not in tmux.
|
|
35
35
|
- Phase-start pane reset (shared — runs BEFORE dispatching each new worker batch):
|
|
36
36
|
- okstra creates two kinds of tmux pane per run: (a) **worker-agent panes** the harness gives to dispatched subagents (titled `claude-worker` / `codex-worker` / `gemini-worker` / `report-writer-worker`), and (b) **trace panes** the codex/gemini wrappers spawn (`<cli>-<role>-<pid>-tail`). Both accumulate across internal phases because each new phase dispatches a fresh worker batch and the prior panes are never reclaimed.
|
|
37
37
|
- When `<RUN_DIR>/state/lead-pane.id` is non-empty (the lead is in tmux), the lead MUST run `$HOME/.okstra/bin/okstra-trace-cleanup.sh --run-dir "<RUN_DIR>"` **immediately before** dispatching the next phase's workers — i.e. just before emitting each `PROGRESS: phase-5.5-convergence round=<N>` marker and just before `PROGRESS: phase-6-synthesis dispatching report-writer-worker`. This closes every prior-phase okstra pane (worker-agent + trace) for this run, while NEVER killing the lead's own pane.
|
|
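Note on the hunk above: the contract treats an empty or absent `<RUN_DIR>/state/lead-pane.id` as the single signal that the lead is not in tmux, and every pane step no-ops in that case. A rough Python sketch of that gate follows, assuming the file holds one pane id on its first line. The helper name `read_lead_pane` is hypothetical, and the sketch does not check that the recorded pane still exists, which the cleanup script does separately.

```python
import tempfile
from pathlib import Path
from typing import Optional


def read_lead_pane(run_dir: str) -> Optional[str]:
    """Return the recorded lead pane id, or None when the file is absent or empty."""
    pane_file = Path(run_dir) / "state" / "lead-pane.id"
    try:
        lines = pane_file.read_text(encoding="utf-8").splitlines()
    except OSError:
        return None                      # absent or unreadable: not in tmux
    first = lines[0].strip() if lines else ""
    return first or None                 # empty file: not in tmux


with tempfile.TemporaryDirectory() as run_dir:
    print(read_lead_pane(run_dir))       # None: file absent
    state = Path(run_dir) / "state"
    state.mkdir()
    (state / "lead-pane.id").write_text("", encoding="utf-8")
    print(read_lead_pane(run_dir))       # None: file empty
    (state / "lead-pane.id").write_text("%12\n", encoding="utf-8")
    print(read_lead_pane(run_dir))       # %12
```

A caller would skip the cleanup invocation whenever this returns `None`.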
@@ -41,8 +41,8 @@ profile document.
|
|
|
41
41
|
- This step is **automatic and silent** — NO user prompt (workers are idle sessions that have already delivered their results; there is nothing for the user to preserve). It runs only when team-state's `teamCreate.status == "ok"` (Teams mode was actually used); in the no-`team_name` fallback there is no team to delete, so silent-skip.
|
|
42
42
|
- Sequence (token-usage collection MUST already be complete — `TeamDelete` removes `~/.claude/teams/<team>/` + `~/.claude/tasks/<team>/` but NOT the `~/.claude/projects/` jsonls Phase 7 reads, yet the read MUST precede teardown):
|
|
43
43
|
1. Read `~/.claude/teams/okstra-<task-key>/config.json` and, for every `members` entry whose name is not the lead, `SendMessage(to: <name>, message: { type: "shutdown_request" })` to terminate it gracefully.
|
|
44
|
-
2.
|
|
45
|
-
3. Call `TeamDelete()
|
|
44
|
+
2. These workers already delivered their results and terminated when their `Agent()` dispatch returned (the lead's completion evidence is the returned output + the existing result/final-report file, not a teardown ack) — a terminated session emits NO shutdown confirmation. Treat `shutdown_request` as best-effort (fire-and-forget); the lead MUST NOT block waiting for acks from addressed teammates. Proceed immediately to step 3.
|
|
45
|
+
3. Call `TeamDelete()` — the single synchronization point for teardown. If it errors with an active-members message, one teammate is genuinely still shutting down: wait briefly, retry `TeamDelete()` once, then proceed regardless of the result. NEVER loop or re-send `shutdown_request`; teardown must never block run completion once the work and final report already exist.
|
|
46
46
|
- Report it in one short line (e.g. `worker 6명 종료 + 팀 해제`) and proceed. Emit `PROGRESS: phase-7-teardown disbanding team` immediately before step 1.
|
|
47
47
|
- Phase wrap-up — okstra pane disposition (shared, MUST be the *last* step before returning control to the user):
|
|
48
48
|
- At run end the only residual okstra panes are the LAST phase's (e.g. the `report-writer-worker` agent pane and any codex/gemini trace pane). `okstra-trace-cleanup.sh --list --run-dir "<RUN_DIR>"` returns one tab-separated `<pane_id>\t<pane_title>` line per residual okstra pane (worker-agent + trace) for this run.
|
|
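Note on the hunk above: the revised steps 2 and 3 make `shutdown_request` fire-and-forget and allow exactly one `TeamDelete()` retry before proceeding regardless of the result. The control flow can be sketched as below. This is an illustration only: `teardown_team`, `send_shutdown`, and `team_delete` are hypothetical callables standing in for the harness tools, and treating every `RuntimeError` as the active-members error is an assumption.

```python
import time
from typing import Callable, Iterable


def teardown_team(
    members: Iterable[str],
    lead_name: str,
    send_shutdown: Callable[[str], None],
    team_delete: Callable[[], None],
    retry_delay_s: float = 1.0,
) -> int:
    """Best-effort teardown; returns how many TeamDelete attempts were made."""
    for name in members:
        if name == lead_name:
            continue
        try:
            send_shutdown(name)          # fire-and-forget, no ack is awaited
        except Exception:
            pass                         # a terminated worker must not block teardown

    attempts = 0
    for _ in range(2):                   # first call plus at most one retry
        attempts += 1
        try:
            team_delete()
            break
        except RuntimeError:             # assumed shape of the active-members error
            if attempts == 1:
                time.sleep(retry_delay_s)  # wait briefly before the single retry
    return attempts                      # proceed regardless of the final result


sent = []
calls = {"n": 0}


def flaky_delete() -> None:
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("active members")


members = ["lead", "codex-worker", "gemini-worker"]
print(teardown_team(members, "lead", sent.append, flaky_delete, retry_delay_s=0.0))  # 2
print(sent)  # ['codex-worker', 'gemini-worker']
```

The loop never re-sends `shutdown_request` and never exceeds two delete attempts, so teardown cannot stall run completion.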
@@ -61,14 +61,14 @@ profile document.
|
|
|
61
61
|
- **Reporter confirmation precondition (BLOCKING)**: the brief's frontmatter carries `reporter-confirmations: <complete | partial | pending | skipped>` set by `okstra-brief` Step 6.5. Every phase that consumes the brief MUST read this field before doing analysis. The handling matrix is:
|
|
62
62
|
- `complete` → proceed normally.
|
|
63
63
|
- `partial` → proceed; treat still-unmarked `intent-check:` / `conversion-block:` rows as the `skipped` branch.
|
|
64
|
-
- `skipped` → do NOT silently infer the missing answers. Promote each unmarked `intent-check:` / `conversion-block:` row into this run's `##
|
|
65
|
-
- `pending` (or field missing) → ABORT analysis; render the Verdict Card with `Verdict Token = blocked` + `Direction = hold` and write a single `## Reporter Confirmation Required` block (no leading number) summarising which rows are pending. The `##
|
|
64
|
+
- `skipped` → do NOT silently infer the missing answers. Promote each unmarked `intent-check:` / `conversion-block:` row into this run's `## 1. Clarification Items` as `Kind=decision`. Use `Blocks=approval` in `implementation-planning`, where the row gates the User Approval Request; otherwise use `Blocks=next-phase`. The recommended answer is drawn from the brief's matching content and clearly labelled `보고자 직접 확인 권장`.
|
|
65
|
+
- `pending` (or field missing) → ABORT analysis; render the Verdict Card with `Verdict Token = blocked` + `Direction = hold` and write a single `## Reporter Confirmation Required` block (no leading number) summarising which rows are pending. The `## 1. Clarification Items` table carries one row per pending item with `Blocks=approval` in `implementation-planning`, otherwise `Blocks=next-phase`. The operator must rerun `okstra-brief` Step 6.5. Do NOT emit `## 0.` for this case — Section 0 is reserved for clarification-response carry-in only.
|
|
66
66
|
`[CONFIRMED <YYYY-MM-DD> → RC-N]` markers on `Open Questions` rows are the per-row signal that the reporter has answered; their answers live verbatim under `## Reporter Confirmations` in the brief.
|
|
67
67
|
- `Source Material` is reporter-verbatim. Do NOT paraphrase, summarize, reorder, or restructure it. Quote it directly when needed.
|
|
68
68
|
- `Augmentation` entries carry one of four labels — `evidence-link`, `format-conversion`, `terminology-mapping`, `intent-inference`. Treat them as follows:
|
|
69
69
|
- `evidence-link` / `format-conversion` → trust without re-verification.
|
|
70
70
|
- `terminology-mapping` → verify against `<PROJECT_ROOT>/.okstra/glossary.md` (authoritative); raise a `Clarification Items` row if the mapping is missing or contradicts the glossary.
|
|
71
|
-
- `intent-inference` → treat as an **unverified hypothesis**. Every `intent-inference` augmentation MUST be paired in the brief with an `Open Questions` row prefixed `intent-check:`. Promote that row into the run's `##
|
|
71
|
+
- `intent-inference` → treat as an **unverified hypothesis**. Every `intent-inference` augmentation MUST be paired in the brief with an `Open Questions` row prefixed `intent-check:`. Promote that row into the run's `## 1. Clarification Items` table as `Kind=decision, Blocks=next-phase` (or `Blocks=approval` for `implementation-planning`) with the recommended answer set to "보고자에게 직접 확인 후 응답" unless the codebase can be inspected to confirm or refute the inference.
|
|
72
72
|
- `Open Questions` row prefixes are signals — do not strip them when promoting:
|
|
73
73
|
- `intent-check:` → `Kind=decision`, recommended answer = reporter confirmation. NEVER silently resolve an `intent-check:` by inference at this layer.
|
|
74
74
|
- `terminology:` → `Kind=decision`, recommended answer = canonical term from `<PROJECT_ROOT>/.okstra/glossary.md` (or "extend okstra glossary via brief Step 4.5").
|
|
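Note on the hunk above: the `reporter-confirmations` handling matrix and the `Blocks` rule (`approval` in `implementation-planning`, otherwise `next-phase`) amount to a small decision table. A hedged Python sketch follows. The function names and action labels are hypothetical, and sending unrecognized values down the `pending` branch is an assumption the contract does not state.

```python
from typing import Optional, Tuple


def blocks_for(task_type: str) -> str:
    """`approval` only gates the implementation-planning approval request."""
    return "approval" if task_type == "implementation-planning" else "next-phase"


def route_brief(reporter_confirmations: Optional[str], task_type: str) -> Tuple[str, Optional[str]]:
    """Return (action, Blocks value for the clarification rows this run writes)."""
    if reporter_confirmations == "complete":
        return ("proceed", None)
    if reporter_confirmations in ("partial", "skipped"):
        # Unmarked intent-check / conversion-block rows become Kind=decision rows.
        return ("promote-unmarked-rows", blocks_for(task_type))
    # `pending`, a missing field, or (by assumption) any unknown value.
    return ("abort", blocks_for(task_type))


print(route_brief("complete", "error-analysis"))           # ('proceed', None)
print(route_brief("skipped", "implementation-planning"))   # ('promote-unmarked-rows', 'approval')
print(route_brief(None, "error-analysis"))                 # ('abort', 'next-phase')
```

In the `abort` case the contract additionally requires a Verdict Card with `Verdict Token = blocked` and `Direction = hold`, which this sketch leaves to the caller.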
@@ -77,22 +77,22 @@ profile document.
|
|
|
77
77
|
- `general:` → free-form; classify per the standard `Clarification Items` rules.
|
|
78
78
|
- Any decision in this run that contradicts the brief's `Source Material` must be raised back to the reporter via a `Clarification Items` row; it must NOT be silently overridden. Disagreement with the reporter is allowed only after the row is resolved.
|
|
79
79
|
- This contract is the single authority on brief consumption. Phase-specific addenda may *tighten* these rules but may not relax them.
|
|
80
|
-
- Clarification request policy (shared — applies whenever a profile uses `##
|
|
81
|
-
- **Canonical column schema (SSOT — must match `templates/reports/final-report.template.md` §
|
|
80
|
+
- Clarification request policy (shared — applies whenever a profile uses `## 1. Clarification Items`):
|
|
81
|
+
- **Canonical column schema (SSOT — must match `templates/reports/final-report.template.md` §1 exactly):** every `## 1. Clarification Items` table has exactly these 8 columns, in this order:
|
|
82
82
|
`| ID | Ticket ID | Kind | Statement | Expected form | Blocks | Status | User input |`.
|
|
83
83
|
Profile-specific addenda may tighten cell content but MUST NOT add, remove, rename, or reorder columns. The `ID` cell uses `C-NNN` (3-digit zero-padded), the `Status` cell ∈ `{open, answered, resolved, obsolete}`, and the `Kind` / `Blocks` legal values are listed below.
|
|
84
|
-
- section
|
|
84
|
+
- section 1 is a **single unified table** per `final-report-template.md`. Every clarification item — whether the user must attach a file, choose between options, or supply a single number/path — is one row of that table. Do not split it into sub-sections (`1.1 추가 자료 요청` / `1.2 사용자 확인 질문` / `5.5.9 Open Questions` are removed and the validator fails reports that reintroduce them), do not create a parallel table elsewhere in the report, and do not duplicate the same item into the top-of-report `User Approval Request (사용자 승인 게이트)` block or any other section.
|
|
85
85
|
- each row's `Kind` column picks one of `{material, decision, data-point}`: `material` for files / snapshots / logs / screenshots the user must attach (the `User input` cell will hold a path or URL); `decision` for choices and yes/no confirmations only the user can make; `data-point` for a single number, ID, date, or short string the user can answer inline. Items that mix "yes/no + file path if yes" are one row of `Kind=material` with the combined expectation written into `Expected form`.
|
|
86
86
|
- each row's `Blocks` column picks one of `{approval, next-phase, none}`. `approval` is reserved for items that gate an approval action, especially the `implementation-planning` User Approval Request; outside `implementation-planning`, unresolved brief reporter-confirmation rows use `next-phase` instead. `next-phase` blocks the next run from starting cleanly. `none` is informational/audit-only.
|
|
87
87
|
- write every entry in full, descriptive sentences that a non-developer can act on without further context. Avoid abbreviations and internal jargon. The `Statement` cell must state *what* is needed, *why* the answer / attachment changes the next step, and (for `material`) *where* the user can find it and *where* to place it. The `Expected form` cell must state the answer shape (예/아니오, 보기 중 하나, 숫자/날짜, 파일 경로, 짧은 서술 등); supply concrete option choices when applicable.
|
|
88
88
|
- if a phase requires a recommended answer, alternatives, or an evidence-check note, encode it inside the existing 8-column schema: put evidence notes in `Statement` as `Evidence checked: <path:line>` or `Evidence checked: none — <human-only reason>`, and put recommendations/options in `Expected form` as `Recommended: <answer> — <rationale>; Alternatives: <options>`. Do not add `Recommended`, `Evidence`, `Alternatives`, or `evidence-checked` columns.
|
|
89
89
|
- the same `final-report.md` file is the canonical artifact carried into the next run; the user appends answers inline before rerunning. The preferred turn-around is `scripts/okstra.sh --resume-clarification --task-key <project-id>:<task-group>:<task-id>` (opens the latest report in `$EDITOR`, then auto-reruns the same phase with `--clarification-response` carry-in). The lower-level form `--clarification-response <path>` remains available for scripted runs.
|
|
90
|
-
- if a clarification response was carried in for this run, render the conditional `## 0. Clarification Response Carried In From Previous Run` section (the template's `RENDER_IF` guard activates it), walk every `C-*` row of the prior report's `##
|
|
90
|
+
- if a clarification response was carried in for this run, render the conditional `## 0. Clarification Response Carried In From Previous Run` section (the template's `RENDER_IF` guard activates it), walk every `C-*` row of the prior report's `## 1. Clarification Items` table, reconcile each one against new evidence, and update its `Status` to `resolved` or `obsolete` before issuing the next decision/verdict. When no carry-in path was provided, omit the `## 0.` heading entirely — the validator fails reports that emit an empty Section 0 stub (e.g. "No prior clarification response was provided for this run.").
|
|
91
91
|
- Verdict Card (shared — applies to every final-report regardless of profile):
|
|
92
|
-
- The top-of-report `## Verdict Card` block is mandatory in every final-report. Its `Verdict Token`, `Direction`, and `Next Step` cells MUST byte-match the corresponding cells in `##
|
|
93
|
-
- Cross-worker traceability (shared — applies to every analysis worker output and to the lead's `##
|
|
92
|
+
- The top-of-report `## Verdict Card` block is mandatory in every final-report. Its `Verdict Token`, `Direction`, and `Next Step` cells MUST byte-match the corresponding cells in `## 7. Final Verdict` and the first item of `## 3. Recommended Next Steps`. The validator treats the card as a non-authoritative index — when card values diverge from the authoritative sections, the run is `contract-violated`.
|
|
93
|
+
- Cross-worker traceability (shared — applies to every analysis worker output and to the lead's `## 6.` / `## 2.` tables in the final-report):
|
|
94
94
|
- **Worker-side item IDs (free-form but unique within the worker).** Every row item in sections 1–5 (and any optional section 6) of an analysis worker's output MUST carry an item ID that is unique within that one worker's result file. The ID convention is the worker's choice — `F-001` / `F-002` per the suggested schema, `1.1` / `1.2` / `1.3` as Codex tends to use, or any other shape — but it MUST appear as the leading column of the row (for table-form items) or as a `[<ID>]` prefix (for bullet/numbered items). Workers that emit findings without IDs make cross-worker reconciliation impossible.
|
|
95
|
-
- **Lead-side ID assignment + source preservation.** When the lead (or `report-writer-worker`) synthesises `##
|
|
95
|
+
- **Lead-side ID assignment + source preservation.** When the lead (or `report-writer-worker`) synthesises `## 6.1 Consensus` / `## 6.2 Differences` / `## 2.1 Primary Evidence` rows from worker outputs, the lead assigns a fresh `C-NNN` / `D-NNN` / `E-NNN` row ID. The `Source items` column (or, where the template still calls it `Supporting workers` / `Workers (position)` / `Source`, that same column) MUST list every contributing worker:item pair (e.g. `claude:F-001, codex:1.1, gemini:F-3`) so a reviewer can trace the synthesised row back to each worker's original wording without re-reading every worker-results file. Bare worker names without item IDs (e.g. `claude, codex, gemini`) are deprecated for these tables; the validator does not yet fail on them but the readability pass treats it as a contract violation.
|
|
96
96
|
- **Why this matters.** A real run had `claude=F-1..F-11`, `codex=1.1..1.8`, `gemini=F-3..F-9` — three incompatible ID schemes. When the lead synthesised `C-1..C-8`, the link from `C-3` back to "which sentence in which worker file" was lost. Source-item preservation restores that link without forcing every worker to adopt a single ID prefix, which would over-constrain worker output style.
|
|
97
97
|
- Audit sidecar (shared — applies to every analysis-worker output and every final-report):
|
|
98
98
|
- Reading Confirmation lines (one short line per input file confirming end-to-end reading) live in the **worker audit sidecar** at `runs/<task-type>/worker-results/<worker>-audit-<task-type>-<seq>.md`, NOT in the worker's main worker-results file. The worker-results body starts at section 1 (Findings). The validator fails worker-results files that contain a `## 0. Reading Confirmation` heading.
|
|
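Note on the hunk above: the canonical `## 1. Clarification Items` schema (8 fixed columns, `C-NNN` IDs, closed value sets for `Kind`, `Blocks`, and `Status`) lends itself to a mechanical check. The sketch below is not the package's validator, whose code does not appear in this section. It is a minimal illustration that assumes a plain pipe-delimited markdown table with no literal `|` inside cells, and the ticket IDs in the sample rows are made up.

```python
import re
from typing import List

COLUMNS = ["ID", "Ticket ID", "Kind", "Statement", "Expected form", "Blocks", "Status", "User input"]
KINDS = {"material", "decision", "data-point"}
BLOCKS = {"approval", "next-phase", "none"}
STATUSES = {"open", "answered", "resolved", "obsolete"}
ID_RE = re.compile(r"C-\d{3}")           # 3-digit zero-padded, e.g. C-001


def split_row(line: str) -> List[str]:
    """Split one markdown table row into trimmed cells."""
    text = line.strip()
    if text.startswith("|"):
        text = text[1:]                  # drop exactly one leading pipe
    if text.endswith("|"):
        text = text[:-1]                 # drop exactly one trailing pipe
    return [cell.strip() for cell in text.split("|")]


def check_table(lines: List[str]) -> List[str]:
    """Return human-readable schema violations; an empty list means no issue found."""
    if split_row(lines[0]) != COLUMNS:
        return ["header does not match the 8-column schema"]
    errors: List[str] = []
    for line in lines[2:]:               # skip the header and the separator row
        cells = split_row(line)
        if len(cells) != len(COLUMNS):
            errors.append(f"wrong cell count: {line.strip()}")
            continue
        row = dict(zip(COLUMNS, cells))
        if not ID_RE.fullmatch(row["ID"]):
            errors.append(f"{row['ID']}: ID is not C-NNN")
        if row["Kind"] not in KINDS:
            errors.append(f"{row['ID']}: illegal Kind {row['Kind']!r}")
        if row["Blocks"] not in BLOCKS:
            errors.append(f"{row['ID']}: illegal Blocks {row['Blocks']!r}")
        if row["Status"] not in STATUSES:
            errors.append(f"{row['ID']}: illegal Status {row['Status']!r}")
    return errors


table = [
    "| ID | Ticket ID | Kind | Statement | Expected form | Blocks | Status | User input |",
    "|---|---|---|---|---|---|---|---|",
    "| C-001 | T-1 | decision | Confirm the rollout window. | yes/no | next-phase | open | |",
    "| C-2 | T-1 | question | Attach the log. | file path | approval | open | |",
]
for problem in check_table(table):
    print(problem)
```

The second sample row is deliberately wrong twice (a non-padded ID and an unknown `Kind`), so the loop prints two problems while the first row passes.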
@@ -30,7 +30,7 @@ are collected and convergence finished. Phase 1-5 do not need it.
|
|
|
30
30
|
- **Feature-flag-gated changes**: confirm the off-switch path was exercised in this run's validation evidence (i.e. one of the validation commands ran with the flag off and succeeded). A plan that ships a flag without exercising the off-path does NOT satisfy this requirement.
|
|
31
31
|
- **Schema migrations, config-format changes, or any change with persisted state**: a **dry-run of the rollback step is mandatory**, not preferred. Record the exact rollback command and its captured exit code / stdout. If the migration tool offers no dry-run mode (`--dry-run`, `--plan`, equivalent), the executor MUST refuse to claim rollback verification and instead end the run with a routing recommendation back to `implementation-planning` for a safer rollback strategy. Skipping this step on a stateful change is treated as a `contract-violated` outcome by `final-verification`.
|
|
32
32
|
- **Routing recommendation for `final-verification`**: brief note on whether the changes are ready for final-verification phase or need a new error-analysis / planning loop first.
|
|
33
|
-
- **Follow-up tasks (Section
|
|
33
|
+
- **Follow-up tasks (Section 4 of the final report)**: every item discovered during this run that was *not* delivered MUST appear in the final report's `## 4. Follow-up Tasks (후속 작업)` table with a concrete `Origin`, `New Task ID`, `Suggested task-type`, `Scope`, and `Reason / Why deferred`. Sources include: out-of-scope discoveries that the executor consciously chose not to fold into this run, verifier concerns the executor declined to fix in-place, scope-boundary items from the approved plan that turned out to need their own ticket, and any unresolved `## 1. Clarification Items` row carried over from the approved plan (`Status` ∈ `{open, answered}` at approval time). An empty section is acceptable but only when expressed as the single line `- 후속 작업 없음.` — silence is treated as a contract violation. Rows with `Auto-spawn? = yes` will be materialised by `scripts/okstra-spawn-followups.py` in Phase 7; rows with `Auto-spawn? = no` MUST also appear in `Section 3. Recommended Next Steps` so the user knows to act manually.
|
|
34
34
|
|
|
35
35
|
## Self-review pass before finalising the report (`Claude lead` runs this; do not delegate to a generic subagent)
|
|
36
36
|
|
|
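Note on the hunk above: the follow-up rule has two checkable parts. An empty Section 4 must be exactly the line `- 후속 작업 없음.`, and every row with `Auto-spawn? = no` must also appear in Section 3. A small Python sketch of both checks follows. The helper name is hypothetical, the task IDs are invented, and matching a row to Section 3 by its `New Task ID` substring is an assumption, since the contract does not say how the cross-reference is written.

```python
from typing import Dict, List

EMPTY_SECTION_LINE = "- 후속 작업 없음."


def follow_up_problems(
    rows: List[Dict[str, str]],
    section4_body: str,
    section3_text: str,
) -> List[str]:
    """Return violations of the follow-up rules; an empty list means none found."""
    problems: List[str] = []
    if not rows:
        # Silence is not allowed: the empty case needs the exact marker line.
        if section4_body.strip() != EMPTY_SECTION_LINE:
            problems.append("empty Section 4 must be exactly the single no-follow-up line")
        return problems
    for row in rows:
        task_id = row.get("New Task ID", "")
        # Manual rows must be surfaced to the user in Section 3 as well.
        if row.get("Auto-spawn?") == "no" and task_id not in section3_text:
            problems.append(f"{task_id}: manual follow-up missing from Section 3")
    return problems


rows = [
    {"New Task ID": "demo:grp:t-2", "Auto-spawn?": "yes"},
    {"New Task ID": "demo:grp:t-3", "Auto-spawn?": "no"},
]
print(follow_up_problems(rows, "(table)", "1. Rerun validation."))
print(follow_up_problems([], EMPTY_SECTION_LINE, ""))
```

The first call reports `demo:grp:t-3` because Section 3 never mentions it; the second call returns an empty list because the empty section uses the required marker line.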
@@ -27,7 +27,7 @@ until Phase 5 ends, then drop from active context for Phase 6/7.
|
|
|
27
27
|
- Doc-only / config-only / pure-rename steps that have no observable runtime behaviour are exempt from the failing-test requirement, but the executor MUST cite the exemption per step in the final report (`TDD exemption: <reason>`).
|
|
28
28
|
- When the touched area has no existing test harness, the executor MUST stand up the minimum harness needed to host one regression test for this run rather than skipping TDD entirely. Record the harness-bootstrap step as an `Out-of-plan edit` if it is not in the plan.
|
|
29
29
|
- **DB / IO / SQL changes require real execution — mock-only is NOT validation evidence:** when this run's diff touches DB/IO/SQL (ORM / query-builder code — sequelize / typeorm / prisma / knex / raw SQL — `*.repository.*`, model/entity files, `migrations/**`, `*.sql`, or any changed query string), a mocked unit test cannot observe the SQL the query builder actually emits — a mocked suite once passed while `count({ col: 'FontFamily.fontFamily' })` threw `Unknown column` on the real DB. The executor MUST run the change against a real (or faithful-replica) datastore — the `db-test` validation step (plan `validation` db step, else `project.json.qaCommands.db-test`), targeting a **local / replica** DB — and cite its exact command + exit code in the final report's `Validation evidence`. If no real DB / `db-test` command is reachable, do NOT claim the change verified: label the DB portion `정적 분석상 …, 미검증(실행 안 함)` in the report, surface it in the routing recommendation, and never downplay the real run as "too heavy". `git push` stays forbidden (universal list); the unverified DB state is carried forward so `final-verification` cannot accept it and `release-handoff` cannot push.
|
|
30
|
-
- re-read the approved plan end-to-end and parse the `##
|
|
30
|
+
- re-read the approved plan end-to-end and parse the `## 5.5 Stage Map`. Read the **Stage batch** injected in the launch prompt (`Stage batch for this implementation run`): it lists the stage numbers this run owns, ascending. The runtime already selected and reserved this batch — do NOT recompute the start stage from `consumers.jsonl`.
|
|
31
31
|
- for each stage in the batch, load every `runs/<plan-key>/carry/stage-<i>.json` for `i ∈ depends-on(stage)` and inject them into the executor's working context as "runtime carry-in". For `depends-on (none)` stages, no sidecar load — task-brief only.
|
|
32
32
|
- the batch's stages are mutually independent (each one's `depends-on` are all already `status:done`, never another batch member), so execute them in ascending order; each stage's file list, step order, Stage Validation commands, Stage Exit Contract, and rollback path are the authoritative scope for that stage.
|
|
33
33
|
- inspect the current state of every file the plan names; if any file has changed materially since the plan was written, stop and route to a new `implementation-planning` run instead of editing speculatively
|