okstra 0.34.1 → 0.36.1
- package/README.kr.md +27 -19
- package/README.md +27 -19
- package/docs/kr/architecture.md +59 -45
- package/docs/kr/cli.md +61 -18
- package/docs/pr-template-usage.md +65 -0
- package/docs/project-structure-overview.md +353 -354
- package/docs/superpowers/plans/2026-05-12-ticket-id-in-reports.md +1 -1
- package/docs/superpowers/plans/2026-05-14-convergence-queue-pruning.md +1 -1
- package/docs/superpowers/plans/2026-05-17-dual-format-final-report.md +1 -1
- package/docs/superpowers/plans/2026-05-20-final-report-language.md +1501 -0
- package/docs/superpowers/plans/2026-05-20-implementation-planning-multi-stage.md +1267 -0
- package/docs/superpowers/plans/2026-05-20-okstra-run-prompt-sot-b1.md +1007 -0
- package/docs/superpowers/plans/2026-05-20-wizard-messages-json-sot.md +720 -0
- package/docs/superpowers/plans/2026-05-20-wizard-prompt-json-sot-a1.md +681 -0
- package/docs/superpowers/plans/2026-05-21-improvement-discovery-task-type.md +1691 -0
- package/docs/superpowers/plans/2026-05-24-implementation-lead-context-slimming.md +1700 -0
- package/docs/superpowers/specs/2026-05-20-final-report-language-design.md +383 -0
- package/docs/superpowers/specs/2026-05-20-implementation-planning-multi-stage-design.md +320 -0
- package/docs/superpowers/specs/2026-05-20-okstra-run-prompt-sot-design.md +299 -0
- package/docs/superpowers/specs/2026-05-21-improvement-discovery-task-type-design.md +335 -0
- package/docs/task-process/README.md +74 -0
- package/docs/task-process/common-flow.md +166 -0
- package/docs/task-process/error-analysis.md +101 -0
- package/docs/task-process/final-verification.md +167 -0
- package/docs/task-process/implementation-planning.md +128 -0
- package/docs/task-process/implementation.md +149 -0
- package/docs/task-process/release-handoff.md +206 -0
- package/docs/task-process/requirements-discovery.md +115 -0
- package/package.json +1 -1
- package/runtime/BUILD.json +2 -2
- package/runtime/agents/SKILL.md +30 -7
- package/runtime/agents/workers/claude-worker.md +31 -6
- package/runtime/agents/workers/codex-worker.md +37 -10
- package/runtime/agents/workers/gemini-worker.md +34 -7
- package/runtime/agents/workers/report-writer-worker.md +19 -10
- package/runtime/bin/okstra-central.sh +6 -6
- package/runtime/bin/okstra-codex-exec.sh +49 -28
- package/runtime/bin/okstra-gemini-exec.sh +39 -21
- package/runtime/bin/okstra-render-final-report.py +13 -2
- package/runtime/bin/okstra-wrapper-status.py +155 -0
- package/runtime/bin/okstra.sh +2 -2
- package/runtime/prompts/launch.template.md +1 -0
- package/runtime/prompts/profiles/_common-contract.md +11 -6
- package/runtime/prompts/profiles/_implementation-deliverable.md +53 -0
- package/runtime/prompts/profiles/_implementation-executor.md +60 -0
- package/runtime/prompts/profiles/_implementation-verifier.md +76 -0
- package/runtime/prompts/profiles/error-analysis.md +3 -7
- package/runtime/prompts/profiles/implementation-planning.md +22 -21
- package/runtime/prompts/profiles/implementation.md +28 -118
- package/runtime/prompts/profiles/improvement-discovery.md +42 -0
- package/runtime/prompts/profiles/release-handoff.md +1 -1
- package/runtime/prompts/profiles/requirements-discovery.md +8 -12
- package/runtime/prompts/wizard/prompts.ko.json +230 -0
- package/runtime/python/lib/okstra/cli.sh +2 -49
- package/runtime/python/lib/okstra/globals.sh +21 -21
- package/runtime/python/lib/okstra/interactive.sh +7 -7
- package/runtime/python/okstra_ctl/clarification_items.py +3 -9
- package/runtime/python/okstra_ctl/consumers.py +53 -0
- package/runtime/python/okstra_ctl/final_report_schema.py +0 -7
- package/runtime/python/okstra_ctl/i18n.py +73 -0
- package/runtime/python/okstra_ctl/improvement_lenses.py +44 -0
- package/runtime/python/okstra_ctl/index.py +1 -1
- package/runtime/python/okstra_ctl/paths.py +26 -20
- package/runtime/python/okstra_ctl/render.py +166 -207
- package/runtime/python/okstra_ctl/render_final_report.py +53 -10
- package/runtime/python/okstra_ctl/run.py +299 -108
- package/runtime/python/okstra_ctl/run_context.py +22 -0
- package/runtime/python/okstra_ctl/seeding.py +186 -0
- package/runtime/python/okstra_ctl/session.py +65 -7
- package/runtime/python/okstra_ctl/wizard.py +348 -127
- package/runtime/python/okstra_ctl/workflow.py +21 -2
- package/runtime/python/okstra_ctl/worktree.py +54 -1
- package/runtime/python/okstra_project/resolver.py +4 -3
- package/runtime/python/okstra_token_usage/report.py +2 -2
- package/runtime/schemas/final-report-v1.0.schema.json +22 -16
- package/runtime/skills/okstra-brief/SKILL.md +102 -218
- package/runtime/skills/okstra-convergence/SKILL.md +2 -3
- package/runtime/skills/okstra-inspect/SKILL.md +581 -0
- package/runtime/skills/okstra-report-writer/SKILL.md +35 -15
- package/runtime/skills/okstra-run/SKILL.md +8 -7
- package/runtime/skills/okstra-schedule/SKILL.md +14 -157
- package/runtime/skills/okstra-setup/SKILL.md +28 -1
- package/runtime/skills/okstra-team-contract/SKILL.md +16 -107
- package/runtime/templates/okstra.CLAUDE.md +104 -0
- package/runtime/templates/reports/brief.template.md +204 -0
- package/runtime/templates/reports/final-report.template.md +93 -98
- package/runtime/templates/reports/i18n/en.json +135 -0
- package/runtime/templates/reports/i18n/ko.json +135 -0
- package/runtime/templates/reports/implementation-planning-input.template.md +18 -0
- package/runtime/templates/reports/improvement-discovery-input.template.md +78 -0
- package/runtime/templates/reports/schedule.template.md +12 -3
- package/runtime/templates/reports/task-brief.template.md +2 -2
- package/runtime/templates/worker-prompt-preamble.md +108 -0
- package/runtime/validators/lib/fixtures.sh +30 -0
- package/runtime/validators/lib/runners.sh +1 -1
- package/runtime/validators/validate-implementation-plan-stages.py +211 -0
- package/runtime/validators/validate-run.py +121 -26
- package/runtime/validators/validate-workflow.sh +2 -2
- package/runtime/validators/validate_improvement_report.py +275 -0
- package/src/config.mjs +18 -0
- package/src/install.mjs +41 -14
- package/src/setup.mjs +133 -1
- package/src/uninstall.mjs +27 -3
- package/runtime/skills/okstra-history/SKILL.md +0 -165
- package/runtime/skills/okstra-logs/SKILL.md +0 -173
- package/runtime/skills/okstra-report-finder/SKILL.md +0 -111
- package/runtime/skills/okstra-status/SKILL.md +0 -246
- package/runtime/skills/okstra-time-summary/SKILL.md +0 -172
@@ -0,0 +1,1700 @@

# Implementation Lead Context Slimming Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Reduce the cumulative lead-context cost so that Phase 6 (worker dispatch) of `--task-type implementation` finishes within a single lead session. Three tracks (B/D/C) simultaneously shrink the dispatch boilerplate, the stage execution unit, and the profile baseline weight.

**Architecture:**
- **Track B (cross-cutting):** Compress the `[Required reading]` (~48 lines) and `[Error reporting]` (~27 lines) clauses, which `okstra-team-contract` forces to be injected verbatim into the worker prompt body, into a one-line pointer to a fragment file. The codex / gemini wrapper subagent resolves the pointer just before dispatch and inlines it into the external CLI call, so only the pointer passes through the lead context.
- **Track D (implementation-only):** Tighten the recommended number of steps per stage in the `implementation-planning` profile from "at most 6" to "recommended ≤ 3, hard cap 6". This cuts the average length of the executor sync block (= the wall-clock time of a single Phase 6 stage) in half.
- **Track C (implementation-only, gated by B/D measurements):** Split the 161 lines of `prompts/profiles/implementation.md` into a thin core (~50 lines) plus 3 sidecars (`_implementation-executor.md` / `_implementation-verifier.md` / `_implementation-deliverable.md`). The lead lazily reads only the executor sidecar on entering Phase 5, only the verifier sidecar just before the Phase 5 verifier dispatch, and only the deliverable sidecar just before Phase 6 report synthesis.
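
The Track B pointer resolution can be sketched roughly as follows. This is a minimal illustration only, assuming a pointer line of the form `[Label] ... fragment: <path> (...)`; the names `POINTER_RE`, `resolve_pointer`, and `expand_prompt` are hypothetical and are not taken from the okstra wrappers.

```python
import re
from pathlib import Path

# Hypothetical pattern: pulls the fragment path out of a pointer line such as
# "[Error reporting] ... fragment: prompts/fragments/error-reporting-block.md (...)".
POINTER_RE = re.compile(r"^\[(?P<label>[^\]]+)\].*?fragment:\s*(?P<path>\S+)")


def resolve_pointer(line: str, repo_root: Path, role_slug: str) -> str:
    """Return the fragment body for a pointer line, or the line unchanged."""
    match = POINTER_RE.match(line)
    if match is None:
        return line  # ordinary prompt line, passed through untouched
    body = (repo_root / match.group("path")).read_text(encoding="utf-8")
    # Only the error-reporting fragment carries <role-slug>; otherwise a no-op.
    return body.replace("<role-slug>", role_slug)


def expand_prompt(prompt: str, repo_root: Path, role_slug: str) -> str:
    """Expand every pointer line in a dispatch prompt before the CLI launch."""
    return "\n".join(
        resolve_pointer(line, repo_root, role_slug) for line in prompt.splitlines()
    )
```

In this sketch the lead only ever holds the one-line pointer; the expanded text exists solely in the payload the wrapper hands to the external CLI.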

**Tech Stack:** Python 3 (`scripts/okstra_ctl/`), Bash (worker wrapper scripts), Markdown (profile / skill / template), pytest (unit tests under `tests/`).

---

## Pre-flight: Preparing the working tree

This plan touches seven directories (`prompts/`, `skills/`, `agents/`, `validators/`, `scripts/okstra_ctl/`, `templates/`, `tests/`), and the build output `runtime/` is regenerated automatically. **The executor works in a separate worktree, not in the location where the plan was written (`main`).**

- [ ] **Pre-step 0: Create the worktree**

```bash
git worktree add -b feat/impl-lead-slim ../Okstra-impl-lead-slim main
cd ../Okstra-impl-lead-slim
```

All subsequent commands in this plan assume `../Okstra-impl-lead-slim` as the cwd.

- [ ] **Pre-step 1: Environment sanity check**

```bash
node --version     # Node 20+
python3 --version  # Python 3.11+
npm run build      # regenerate runtime/, starting baseline
python3 -m pytest tests/ -x -q
```

Both the build and the test run above must pass before starting the plan. If either fails, `main` itself is broken, so resolve the root cause first.

---

## File Structure

Files to be created or meaningfully modified, and their responsibilities:

**Track B: Splitting out the Required reading / Error reporting fragments**

| Kind | Path | Responsibility |
|------|------|------|
| Create | `prompts/fragments/required-reading-block.md` | SSOT for the `[Required reading]` clause body (shared by analysis workers and the report-writer; audience tokens can be substituted) |
| Create | `prompts/fragments/error-reporting-block.md` | SSOT for the `[Error reporting]` clause body (`<role-slug>` can be substituted) |
| Modify | `skills/okstra-team-contract/SKILL.md` | Replace the two verbatim clause bodies with a `Read <fragment-path>` pointer rule. Byte-identical enforcement is delegated to the fragment files themselves |
| Modify | `agents/workers/codex-worker.md` | Add a new step "Resolve required-reading + error-reporting pointers, expand fragments into the CLI prompt" |
| Modify | `agents/workers/gemini-worker.md` | Same (a change byte-identical to the one in codex-worker) |
| Modify | `agents/SKILL.md` | Change the dispatch prompt authoring rule to use fragment pointers. Also update the existing "Asymmetry between claude-worker and codex/gemini-worker prompts" box |
| Modify | `tools/build.mjs` | Add `prompts/fragments/` to the runtime sync targets |
| Create | `tests/test_required_reading_fragment.py` | Guarantee that the fragment file exists and is byte-identical to the former inline body (diff = 0) |
| Create | `tests/test_dispatch_prompt_pointer.py` | The dispatch prompt authoring helper emits only the pointer (fails if the verbatim body is included) |

**Track D: Tightening the recommended steps per stage**

| Kind | Path | Responsibility |
|------|------|------|
| Modify | `prompts/profiles/implementation-planning.md` | L64 `Effective row count ≤ 6` → `recommended ≤ 3, hard cap 6`. L95 Self-review step 7 gets the same wording. Add the rationale "if a stage has 4 or more steps, re-examine whether it can be split" to the `Stage Map self-check` step |
| Modify | `templates/reports/implementation-planning-input.template.md` | Update the step-count column values in the example Stage Map table to 2 / 3. Expose the "recommended ≤ 3" wording as a guideline for writers |
| Modify | `validators/validate-implementation-plan-stages.py` | Keep the `steps > 6` hard fail and add a warning when `steps > 3` (one line on stderr, no effect on the exit code). Extract the warning threshold into a constant so it can be unit-tested |
| Create | `tests/test_validate_implementation_plan_stages_warnings.py` | step=2/3 → no warning, step=4/5/6 → warning, step=7 → existing error preserved |
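
The two thresholds described for the Track D validator can be sketched as below. This is an assumption-laden illustration, not the actual contents of `validate-implementation-plan-stages.py`: the constant names, the function `check_stage_steps`, and the message wording are all hypothetical.

```python
import sys

# Thresholds named in the plan; the constant names themselves are assumptions.
RECOMMENDED_MAX_STEPS = 3
HARD_CAP_STEPS = 6


def check_stage_steps(stage_id: str, steps: int, stream=sys.stderr) -> bool:
    """Return True when the stage passes; warn (without failing) above the recommendation."""
    if steps > HARD_CAP_STEPS:
        print(f"error: {stage_id} has {steps} steps (hard cap {HARD_CAP_STEPS})", file=stream)
        return False
    if steps > RECOMMENDED_MAX_STEPS:
        # Warning only: one line on the stream, the pass/fail result is unaffected.
        print(f"warning: {stage_id} has {steps} steps (recommended <= {RECOMMENDED_MAX_STEPS})", file=stream)
    return True
```

Keeping both limits as module-level constants is what makes the step=3 / step=4 / step=7 boundaries easy to exercise from a unit test.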

**Track C: implementation profile lazy split**

| Kind | Path | Responsibility |
|------|------|------|
| Create | `prompts/profiles/_implementation-executor.md` | Pre-implementation context exploration + Stage execution contract + Allowed actions (executor side) + TDD loop. Lazy-loaded on entering Phase 5. Not an INCLUDE target; the lead reads it directly |
| Create | `prompts/profiles/_implementation-verifier.md` | Verifier QA duties (two-tier command lookup, deny-list, discrepancy rule, read-only command log) + verifier-specific forbidden actions. Lazy-loaded just before the Phase 5 verifier dispatch |
| Create | `prompts/profiles/_implementation-deliverable.md` | Required deliverable shape + Lead post-stage persistence + Self-review pass + Routing recommendation. Lazy-loaded just before the Phase 6 report-writer dispatch |
| Modify | `prompts/profiles/implementation.md` | Keep only the thin core (~50 lines): Purpose, Required workers, Executor binding, `{{INCLUDE:_common-contract.md}}`, Pre-implementation gate, Task worktree, Forbidden actions (risks common to every phase), In-phase debugging, and a "Lazy section pointers" box |
| Modify | `agents/SKILL.md` | New box "Implementation profile lazy reading discipline": read the executor sidecar on entering Phase 5, the verifier sidecar at the Phase 5 verifier dispatch, and the deliverable sidecar at Phase 6. Entering a phase without having read its sidecar is BLOCKING |
| Modify | `validators/validate-run.py` | Make sure the deliverable substring checks on reports produced by the implementation profile still work after the sidecar split: first confirm which substrings are currently checked, then update so that none are missed |
| Create | `tests/test_implementation_profile_split.py` | (1) The union of the thin core and the 3 sidecars is semantically equivalent to the existing 161-line implementation.md (zero substring loss). (2) Phase 5 can be entered with the thin core alone / at verifier dispatch, an unread sidecar → assert fail. (3) The sidecar files themselves contain no `{{INCLUDE:}}` token (the lead reads them directly, so no expansion is needed) |
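
One way to picture the lazy reading discipline in the table above is a small guard that records which sidecars have been read. This is only a sketch under assumptions: the step keys in `SIDECAR_FOR` and the `LazyProfile` class are hypothetical and do not describe how the lead or `agents/SKILL.md` actually enforce the rule.

```python
from pathlib import Path

# Step-to-sidecar mapping taken from the Track C table; the key names are assumptions.
SIDECAR_FOR = {
    "phase5-executor": "_implementation-executor.md",
    "phase5-verifier": "_implementation-verifier.md",
    "phase6-deliverable": "_implementation-deliverable.md",
}


class LazyProfile:
    """Tracks which sidecars have been read and blocks entry otherwise."""

    def __init__(self, profiles_dir: Path):
        self.profiles_dir = profiles_dir
        self.loaded: dict[str, str] = {}

    def read_sidecar(self, step: str) -> str:
        text = (self.profiles_dir / SIDECAR_FOR[step]).read_text(encoding="utf-8")
        # Sidecars are read directly, so an INCLUDE token would never be expanded.
        if "{{INCLUDE:" in text:
            raise ValueError(f"{SIDECAR_FOR[step]} must not contain INCLUDE tokens")
        self.loaded[step] = text
        return text

    def enter(self, step: str) -> None:
        if step not in self.loaded:
            raise RuntimeError(f"BLOCKING: {SIDECAR_FOR[step]} not read before {step}")
```

Each sidecar is loaded only when its step is reached, so the thin core stays the only profile text held for the whole session.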

**Common to all tracks**

| Kind | Path | Responsibility |
|------|------|------|
| Modify | `CHANGES.md` | 3 user-impact entries (B / D / C), each including a one-line `사용자 영향:` ("User impact:") |
| Modify | `CLAUDE.md` "Where to find things" | Add one line each for the fragment and sidecar file locations |

`runtime/` is build output, so do not edit it directly. Run `npm run build` once at the end of each track.

---

## Track B: Splitting the Required reading / Error reporting boilerplate into fragments

**Why this comes first:** It is cross-cutting (every phase, every worker dispatch). A small change yields the largest cumulative saving. It is orthogonal to Tracks D/C, so there are no conflicts.

### Task B1: Create the `[Required reading]` fragment file

**Files:**
- Create: `prompts/fragments/required-reading-block.md`
- Test: `tests/test_required_reading_fragment.py`

- [ ] **Step 1: Write the failing test (fragment exists + content checks)**

```python
# tests/test_required_reading_fragment.py
from pathlib import Path

REPO_ROOT = Path(__file__).resolve().parents[1]
FRAGMENT = REPO_ROOT / "prompts/fragments/required-reading-block.md"
TEAM_CONTRACT = REPO_ROOT / "skills/okstra-team-contract/SKILL.md"


def test_fragment_file_exists():
    assert FRAGMENT.is_file(), f"missing fragment: {FRAGMENT}"


def test_fragment_starts_with_block_marker():
    content = FRAGMENT.read_text(encoding="utf-8")
    assert content.startswith("[Required reading]\n"), \
        "fragment must start with '[Required reading]\\n' so it slots into worker prompt verbatim"


def test_fragment_contains_canonical_input_filenames():
    # Body compatibility: every key token from the existing team-contract must be preserved
    content = FRAGMENT.read_text(encoding="utf-8")
    for token in [
        "task-brief.md",
        "analysis-profile.md",
        "analysis-material.md",
        "reference-expectations.md",
        "clarification-response.md",
        "final-report-template.md",
        "REPORT WRITER ONLY",
        "Reading Confirmation",
        "audit sidecar",
    ]:
        assert token in content, f"fragment lost canonical token: {token!r}"
```

- [ ] **Step 2: Confirm the test fails**

```bash
python3 -m pytest tests/test_required_reading_fragment.py -v
```

Expected: 3 failures (`FileNotFoundError` / `AssertionError`).

- [ ] **Step 3: Write the fragment file (move the body of `skills/okstra-team-contract/SKILL.md` L111-159 over as-is)**

```bash
mkdir -p prompts/fragments
```

The leading `<!-- ... -->` line in the block below is only a path label. Leave it out of the file itself, because the Step 1 test requires the file to start with `[Required reading]`.

```markdown
<!-- prompts/fragments/required-reading-block.md -->
[Required reading]
You are required to read every input file listed below from the very first
character to the very last character before you produce any analysis output.
Skimming, partial reads, jumping to a single section, or relying on prior
knowledge of a similar file's structure is not acceptable. Each file may
contain decisive context that is not surfaced in its summary or first page.

Required files for this run (read in this order, end-to-end, no exceptions):

1. <Project Root>/<instruction-set>/task-brief.md
2. <Project Root>/<instruction-set>/analysis-profile.md
3. <Project Root>/<instruction-set>/analysis-material.md (only if present)
4. <Project Root>/<instruction-set>/reference-expectations.md
5. <Project Root>/<instruction-set>/clarification-response.md (only if a carry-in was provided for this run)
6. <Project Root>/<instruction-set>/final-report-template.md (REPORT WRITER ONLY — omit for analysis workers and reverify dispatches)

Reading rules:

- Use a single Read tool call per file with no offset and no limit. Do not
page through the file with offset/limit unless the file is genuinely too
large for one read; if you must page, you MUST cover the entire file
before moving on, and you MUST state the page boundaries you used in your
Findings section.
- For the carry-in clarification response, read the conditional
`## 0. Clarification Response Carried In From Previous Run` section
(rendered only when carry-in is non-empty) and every row of
`## 5. Clarification Items` (`C-001`, `C-002`, ...) in full,
including rows whose `User input` cell is blank. The fact that you
will write your output into a file with a structurally similar
section 5 is NOT an excuse to skim — the prior `C-*` rows carry
context you cannot reconstruct from the new run alone.
- Write the Reading Confirmation block to your **audit sidecar** at
`runs/<task-type>/worker-results/<worker>-audit-<task-type>-<seq>.md`
(sibling to the main worker-results file). One short line per input
file confirming end-to-end reading, e.g. "Read task-brief.md
end-to-end (147 lines)." Do NOT include a `## 0. Reading Confirmation`
heading in the main worker-results file — the validator now fails
worker-results that contain one. If you cannot truthfully confirm a
file end-to-end, record a `tool-failure` in the errors sidecar
instead of fabricating Findings.
- Do not collapse multiple input files into a single mental summary before
reading them all individually. Each file has its own canonical role
(brief = the user's request, profile = the lead's rules for this phase,
reference-expectations = ground-truth config/deployment values,
clarification-response = prior run's open questions and the user's
answers, final-report-template = the structure your eventual writeup
must conform to). Conflating them loses signal.
```

- [ ] **Step 4: Confirm the test passes**

```bash
python3 -m pytest tests/test_required_reading_fragment.py -v
```

Expected: 3 PASS.

- [ ] **Step 5: Commit**

```bash
git add prompts/fragments/required-reading-block.md tests/test_required_reading_fragment.py
git commit -m "feat(prompts/fragments): extract [Required reading] clause into SSOT fragment"
```

### Task B2: Create the `[Error reporting]` fragment file

**Files:**
- Create: `prompts/fragments/error-reporting-block.md`
- Test: `tests/test_error_reporting_fragment.py`

- [ ] **Step 1: Write the failing test**

```python
# tests/test_error_reporting_fragment.py
from pathlib import Path

REPO_ROOT = Path(__file__).resolve().parents[1]
FRAGMENT = REPO_ROOT / "prompts/fragments/error-reporting-block.md"


def test_fragment_file_exists():
    assert FRAGMENT.is_file()


def test_fragment_starts_with_block_marker():
    content = FRAGMENT.read_text(encoding="utf-8")
    assert content.startswith("[Error reporting]\n")


def test_fragment_has_role_slug_placeholder():
    content = FRAGMENT.read_text(encoding="utf-8")
    assert "<role-slug>" in content, \
        "fragment must keep <role-slug> placeholder so wrapper subagents substitute their own role"


def test_fragment_preserves_schema_keys():
    content = FRAGMENT.read_text(encoding="utf-8")
    for key in ["ts", "phase", "errorType", "tool-failure", "commandKind",
                "exitCode", "durationMs", "stderrExcerpt"]:
        assert key in content, f"schema key dropped: {key!r}"
```

- [ ] **Step 2: Confirm the test fails**

```bash
python3 -m pytest tests/test_error_reporting_fragment.py -v
```

Expected: 4 failures.

- [ ] **Step 3: Write the fragment (the body of `skills/okstra-team-contract/SKILL.md` L169-196, as-is)**

The leading `<!-- ... -->` line in the block below is only a path label. Leave it out of the file itself, because the Step 1 test requires the file to start with `[Error reporting]`.

```markdown
<!-- prompts/fragments/error-reporting-block.md -->
[Error reporting]
If any tool call you make (Bash, Read, Edit, MCP, etc.) returns a non-zero
exit code, raises an exception, or otherwise fails its intended effect,
append a single entry to your worker errors sidecar at:

runs/<task-type>/worker-results/<role-slug>-errors-<task-type>-<seq>.json

Schema (create the file with {"schemaVersion": 1, "errors": []} if absent):

{
"ts": "<ISO 8601 UTC>",
"phase": "<current okstra phase>",
"errorType": "tool-failure",
"command": "<failed command/tool signature>",
"commandKind": "bash | tool:Read | tool:Edit | mcp | ...",
"exitCode": <int or null>,
"durationMs": <int or null>,
"message": "<one-line human summary>",
"stderrExcerpt": "<first ~2KB of stderr, or null>",
"context": { ... or null }
}

Do NOT include source / recordedAt / agent / agentRole / model / taskKey —
Lead will fill those in. Do NOT use errorType values other than
"tool-failure" in the sidecar. Continue your task after recording; do not
abort unless the failure makes the task impossible.
```
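
A worker-side helper that follows the schema in the fragment above might look like the sketch below. The function name `record_tool_failure` and its parameters are hypothetical; only the JSON keys and the `{"schemaVersion": 1, "errors": []}` bootstrap come from the fragment.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def record_tool_failure(sidecar: Path, phase: str, command: str, command_kind: str,
                        message: str, exit_code=None, duration_ms=None,
                        stderr_excerpt=None, context=None) -> dict:
    """Append one tool-failure entry to the worker errors sidecar."""
    if sidecar.exists():
        doc = json.loads(sidecar.read_text(encoding="utf-8"))
    else:
        doc = {"schemaVersion": 1, "errors": []}
    entry = {
        "ts": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "phase": phase,
        "errorType": "tool-failure",  # the only value the fragment allows
        "command": command,
        "commandKind": command_kind,
        "exitCode": exit_code,
        "durationMs": duration_ms,
        "message": message,
        # Keep roughly the first 2 KB, as the fragment asks (character-based here).
        "stderrExcerpt": stderr_excerpt[:2048] if stderr_excerpt else None,
        "context": context,
    }
    doc["errors"].append(entry)
    sidecar.parent.mkdir(parents=True, exist_ok=True)
    sidecar.write_text(json.dumps(doc, indent=2), encoding="utf-8")
    return entry
```

The helper deliberately omits `source`, `recordedAt`, `agent`, `agentRole`, `model`, and `taskKey`, since the fragment reserves those fields for the lead.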

- [ ] **Step 4: Confirm the test passes**

```bash
python3 -m pytest tests/test_error_reporting_fragment.py -v
```

Expected: 4 PASS.

- [ ] **Step 5: Commit**

```bash
git add prompts/fragments/error-reporting-block.md tests/test_error_reporting_fragment.py
git commit -m "feat(prompts/fragments): extract [Error reporting] clause into SSOT fragment"
```

### Task B3: Replace the `okstra-team-contract` skill body with pointer rules

**Files:**
- Modify: `skills/okstra-team-contract/SKILL.md` (two blocks: L86-159 and L161-199)
- Test: verified by a full regression run of the existing `tests/`

- [ ] **Step 1: Write the failing test (checks that the verbatim body has been removed from team-contract)**

```python
# tests/test_team_contract_uses_pointers.py
from pathlib import Path

REPO_ROOT = Path(__file__).resolve().parents[1]
TC = REPO_ROOT / "skills/okstra-team-contract/SKILL.md"


def test_required_reading_body_replaced_with_pointer():
    content = TC.read_text(encoding="utf-8")
    # The verbatim first line from the body must no longer be present
    assert "Required files for this run (read in this order, end-to-end" not in content, \
        "team-contract still inlines [Required reading] body; expected pointer only"
    # Instead, a pointer to the fragment path must be present
    assert "prompts/fragments/required-reading-block.md" in content


def test_error_reporting_body_replaced_with_pointer():
    content = TC.read_text(encoding="utf-8")
    assert '"errorType": "tool-failure",' not in content, \
        "team-contract still inlines [Error reporting] JSON schema; expected pointer only"
    assert "prompts/fragments/error-reporting-block.md" in content


def test_pointer_rules_remain_blocking():
    content = TC.read_text(encoding="utf-8")
    # The BLOCKING marker must survive (the rule itself has not gone away)
    assert "BLOCKING" in content
    assert "byte-identical" in content
```

- [ ] **Step 2: Confirm the test fails**

```bash
python3 -m pytest tests/test_team_contract_uses_pointers.py -v
```

Expected: 2 FAIL (the verbatim body is still present), 1 PASS.

- [ ] **Step 3: Replace the `Required reading clause` section**

Replace `skills/okstra-team-contract/SKILL.md` L86-159 (the entire Required reading clause section) with the following:

````markdown
### Required reading clause (analysis workers + report-writer worker)

The clause body lives in `prompts/fragments/required-reading-block.md` (SSOT). Every analysis worker's prompt and the report-writer worker's prompt MUST embed that file's exact content verbatim — the lead's dispatch helper is responsible for the expansion (see `agents/workers/codex-worker.md` / `gemini-worker.md` "Resolve required-reading pointer" step for CLI workers; in-process `claude-worker` already mounts the same content through `## Required Reading Before Any Analysis` and does not need re-injection).

Replace placeholder file paths in the block with the actual project-relative paths derived from the run's `instruction-set/` and (if applicable) the carry-in clarification response.

**Audience-scoped enumeration (BLOCKING — performance optimization):**

Different recipients need different files. Do NOT include `final-report-template.md` in analysis worker prompts: analysis workers produce findings (not the final report), and forcing them to read the template inflates token usage without changing finding quality.

| Recipient | Files included in `[Required reading]` |
|---|---|
| Claude / Codex / Gemini analysis workers | task-brief, analysis-profile, analysis-material (if present), reference-expectations, clarification-response (if carry-in) |
| Report writer worker (Phase 6) | all of the above **plus** `final-report-template.md` |
| Reverify dispatches (Phase 5.5, lightweight mode) | **do NOT inject the `[Required reading]` clause at all** — see [okstra-convergence](../okstra-convergence/SKILL.md) "Reverify prompt: required-reading suppression". |

**Asymmetry between claude-worker and codex/gemini-worker prompts (NOT a bug):**

The dispatch prompt the lead constructs for `claude-worker` is intentionally shorter than for `codex-worker` / `gemini-worker`. Do NOT "fix" this by re-injecting `[Required reading]` / `[Error reporting]` / `[Output Contract]` blocks into the claude-worker prompt:

- `claude-worker` is an in-process Claude subagent. The Agent SDK auto-loads `agents/claude-worker.md`, which already contains `## Required Reading Before Any Analysis`, `## Worker Output Structure`, and `## Error reporting`. Re-injecting them is redundant and wastes tokens.
- `codex-worker` / `gemini-worker` shell out to a CLI. The CLI never sees the agent definition file — it only sees the prompt body passed via stdin. Therefore the lead's dispatch prompt for those workers MUST carry a single pointer line:

```
[Required reading] — fragment: prompts/fragments/required-reading-block.md (resolve before CLI launch)
```

The wrapper subagent (codex-worker / gemini-worker definition) is contractually required to read that fragment, perform the placeholder substitutions, and inline the resulting block into the CLI stdin payload before invoking `okstra-codex-exec.sh` / `okstra-gemini-exec.sh`. The lead MUST NOT inline the fragment body itself — the pointer-only form is what keeps lead context flat across runs.

Worker definition file size (claude-worker ~106 lines vs codex/gemini ~175–179 lines) is NOT evidence of incompleteness — it reflects the in-process vs CLI-wrapper distinction.
````
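
The audience-scoped enumeration table in the replacement text above can be read as a small selection function. The sketch below is illustrative only: the recipient labels (`"analysis"`, `"report-writer"`, `"reverify"`) and the function name are assumptions, while the file lists mirror the table.

```python
# Recipient keys are hypothetical labels; the file lists mirror the audience table.
BASE_FILES = [
    "task-brief.md",
    "analysis-profile.md",
    "analysis-material.md",        # only if present
    "reference-expectations.md",
    "clarification-response.md",   # only if a carry-in was provided
]


def required_reading_files(recipient: str, has_material: bool, has_carry_in: bool) -> list[str]:
    """Return the files to enumerate in [Required reading] for one recipient."""
    if recipient == "reverify":
        return []  # Phase 5.5 reverify dispatches omit the clause entirely
    files = [
        name for name in BASE_FILES
        if not (name == "analysis-material.md" and not has_material)
        and not (name == "clarification-response.md" and not has_carry_in)
    ]
    if recipient == "report-writer":
        files.append("final-report-template.md")  # report writer only
    return files
```

Analysis workers never receive `final-report-template.md` in this sketch, which is the token saving the table is after.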

- [ ] **Step 4: Replace the `Error reporting clause` section**

Replace `skills/okstra-team-contract/SKILL.md` L161-196 (the entire `### Error reporting clause`) with the following:

````markdown
### Error reporting clause (analysis workers)

The clause body lives in `prompts/fragments/error-reporting-block.md` (SSOT). Every analysis worker (Claude / Codex / Gemini) receives the same content; the wrapper subagent for codex/gemini resolves and inlines the fragment into the CLI stdin payload at dispatch time, substituting `<role-slug>` with the receiving role (`claude-worker` / `codex-worker` / `gemini-worker`).

The lead's dispatch prompt for codex/gemini workers MUST carry a single pointer line:

```
[Error reporting] — fragment: prompts/fragments/error-reporting-block.md (resolve before CLI launch, substitute <role-slug>)
```

Subagent definitions for all three workers already carry an equivalent in-process contract (see each `agents/workers/*.md` "Error reporting" section). The fragment serves as a redundant safety net so any worker dispatched without its custom definition still receives identical instructions.
````

- [ ] **Step 5: Confirm the tests pass**

```bash
python3 -m pytest tests/test_team_contract_uses_pointers.py -v
python3 -m pytest tests/ -x -q  # full regression
```

Expected: 3 PASS, and 0 failures in the regression run.

- [ ] **Step 6: Commit**

```bash
git add skills/okstra-team-contract/SKILL.md tests/test_team_contract_uses_pointers.py
git commit -m "refactor(team-contract): replace [Required reading]/[Error reporting] body with fragment pointers"
```
|
|
430
|
+
|
|
431
|
+
### Task B4: Add a step in which the codex-worker / gemini-worker wrapper resolves the pointers

**Files:**
- Modify: `agents/workers/codex-worker.md` (add a step inside Execution Rules)
- Modify: `agents/workers/gemini-worker.md` (byte-identical change)
- Test: `tests/test_dispatch_prompt_pointer.py`

- [ ] **Step 1: Write the failing test**

```python
# tests/test_dispatch_prompt_pointer.py
from pathlib import Path
import re

REPO_ROOT = Path(__file__).resolve().parents[1]
CODEX = REPO_ROOT / "agents/workers/codex-worker.md"
GEMINI = REPO_ROOT / "agents/workers/gemini-worker.md"
FRAGMENT_RR = REPO_ROOT / "prompts/fragments/required-reading-block.md"
FRAGMENT_ER = REPO_ROOT / "prompts/fragments/error-reporting-block.md"


def _expect_resolve_step(path: Path):
    content = path.read_text(encoding="utf-8")
    # The body must contain a "Resolve required-reading ... pointer" heading,
    assert re.search(r"^##\s+Resolve required-reading.*pointer", content, re.M), \
        f"{path.name} missing 'Resolve required-reading pointer' section"
    # must reference both fragment paths,
    assert "prompts/fragments/required-reading-block.md" in content
    assert "prompts/fragments/error-reporting-block.md" in content
    # and must never inline the verbatim body.
    assert "Required files for this run (read in this order" not in content, \
        f"{path.name} inlines fragment body; pointer-resolve step must NOT duplicate it"


def test_codex_worker_has_resolve_step():
    _expect_resolve_step(CODEX)


def test_gemini_worker_has_resolve_step():
    _expect_resolve_step(GEMINI)


def test_codex_and_gemini_resolve_steps_are_byte_identical():
    """The Resolve sections of the codex / gemini workers must be byte-identical (only role tokens differ)."""
    def _extract(path: Path) -> str:
        content = path.read_text(encoding="utf-8")
        m = re.search(
            # Stop at the next `## ` heading, or at end of file when the
            # Resolve section is the last one.
            r"^##\s+Resolve required-reading.*?(?=^##\s|\Z)",
            content, re.M | re.S,
        )
        assert m, f"{path.name} missing Resolve section"
        body = m.group(0)
        # Normalize the parts that differ only by role name.
        body = body.replace("codex-worker", "<ROLE>").replace("gemini-worker", "<ROLE>")
        body = body.replace("Codex", "<ROLE_TITLE>").replace("Gemini", "<ROLE_TITLE>")
        body = body.replace("okstra-codex-exec.sh", "<EXEC_SH>").replace("okstra-gemini-exec.sh", "<EXEC_SH>")
        # The *_REQUIRED_READING_FRAGMENT_MISSING sentinel carries an upper-case role prefix.
        body = body.replace("CODEX_", "<ROLE_UPPER>_").replace("GEMINI_", "<ROLE_UPPER>_")
        return body

    assert _extract(CODEX) == _extract(GEMINI), \
        "codex/gemini Resolve sections diverged — keep them byte-identical modulo role tokens"
```

- [ ] **Step 2: Confirm the tests fail**

```bash
python3 -m pytest tests/test_dispatch_prompt_pointer.py -v
```

Expected: 3 FAIL.

- [ ] **Step 3: Add the `## Resolve required-reading + error-reporting pointers` section to codex-worker.md**

Insert the following right after the existing "Execution Rules" section of `agents/workers/codex-worker.md` (the exact position is directly above the first `## ` heading that follows it in codex-worker.md; right after the existing prompt persist step is the natural spot):

````markdown
## Resolve required-reading + error-reporting pointers

Before launching `okstra-codex-exec.sh`, the lead's dispatch prompt arrives with two pointer lines instead of the legacy verbatim blocks:

```
[Required reading] — fragment: prompts/fragments/required-reading-block.md (resolve before CLI launch)
[Error reporting] — fragment: prompts/fragments/error-reporting-block.md (resolve before CLI launch, substitute <role-slug>)
```

This wrapper subagent MUST expand both pointers in-process BEFORE writing the prompt history file and BEFORE shelling out to the CLI:

1. `Read` the file at `<Project Root>/prompts/fragments/required-reading-block.md` (whole file, single Read call, no offset/limit). If the file is missing, return `CODEX_REQUIRED_READING_FRAGMENT_MISSING` without proceeding.
2. Substitute the placeholders in the fragment with the actual run-specific paths the lead supplied in its dispatch prompt body (`<Project Root>`, `<instruction-set>` segments, and the conditional carry-in line). Omit row 6 (`final-report-template.md`) — this worker is NOT the report-writer.
3. `Read` the file at `<Project Root>/prompts/fragments/error-reporting-block.md` and substitute `<role-slug>` with `codex-worker`.
4. Replace the two pointer lines in the dispatch prompt body with the expanded blocks (Required reading first, Error reporting next, in their original order).
5. Persist the **expanded** prompt to the assigned prompt history path — this is what gets piped into the CLI and what auditors will read later. The pointer-only form must not survive into the history file.
6. Only then invoke `okstra-codex-exec.sh` with the expanded prompt path.

Rationale: lead carries the pointer (~2 lines) in its own context instead of the full ~75-line clause body, so the lead's per-dispatch context cost drops by ~95% for these two clauses. The wrapper subagent's own context picks up the expansion cost (~2KB once per dispatch), but the wrapper is a fresh subagent per worker — that cost does not accumulate across dispatches the way lead context does.

If either fragment file is missing on disk, abort with the `_FRAGMENT_MISSING` sentinel above — do NOT fall back to inlining a remembered version of the body. The fragment files are the SSOT; stale in-memory copies are a contract violation.
````
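The six-step resolve procedure can be sketched in code. This is an illustration under stated assumptions, not the wrapper's actual implementation: `expand_pointers` and `FragmentMissing` are hypothetical names, the exception stands in for the `*_FRAGMENT_MISSING` sentinel, and the run-specific path substitution of step 2 is left out for brevity.

```python
from pathlib import Path

# Pointer label -> fragment path relative to the project root (from the section above).
POINTERS = {
    "[Required reading]": "prompts/fragments/required-reading-block.md",
    "[Error reporting]": "prompts/fragments/error-reporting-block.md",
}


class FragmentMissing(RuntimeError):
    """Hypothetical stand-in for the `*_FRAGMENT_MISSING` sentinel."""


def expand_pointers(prompt: str, project_root: Path, role_slug: str,
                    history_path: Path) -> str:
    """Expand both pointer lines, persist the expanded prompt, and return it."""
    out_lines = []
    for line in prompt.splitlines():
        label = next((k for k in POINTERS if line.startswith(k)), None)
        if label is None:
            out_lines.append(line)  # ordinary prompt line: keep as-is
            continue
        fragment = project_root / POINTERS[label]
        if not fragment.is_file():
            # Abort; never fall back to a remembered copy of the body.
            raise FragmentMissing(str(fragment))
        body = fragment.read_text(encoding="utf-8").rstrip("\n")
        if label == "[Error reporting]":
            body = body.replace("<role-slug>", role_slug)
        out_lines.append(body)
    expanded = "\n".join(out_lines) + "\n"
    # Only the expanded form is written to the history file.
    history_path.write_text(expanded, encoding="utf-8")
    return expanded
```

Raising before anything is written means a missing fragment can never leave a pointer-only prompt in the history file.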

- [ ] **Step 4: Insert the same section into gemini-worker.md byte-identically**

Insert the same block at the same position (right after Execution Rules) in `agents/workers/gemini-worker.md`, substituting only the following tokens:

- `okstra-codex-exec.sh` → `okstra-gemini-exec.sh`
- `CODEX_REQUIRED_READING_FRAGMENT_MISSING` → `GEMINI_REQUIRED_READING_FRAGMENT_MISSING`
- `codex-worker` (role-slug) → `gemini-worker`

All other lines stay unchanged.

- [ ] **Step 5: Confirm the tests pass**

```bash
python3 -m pytest tests/test_dispatch_prompt_pointer.py -v
```

Expected: 3 PASS.

- [ ] **Step 6: Commit**

```bash
git add agents/workers/codex-worker.md agents/workers/gemini-worker.md tests/test_dispatch_prompt_pointer.py
git commit -m "feat(workers/codex,gemini): resolve required-reading + error-reporting pointers in wrapper subagent"
```

### Task B5: Update the lead's dispatch rules (`agents/SKILL.md`)

**Files:**
- Modify: `agents/SKILL.md` (Phase 2-5 box + Anti-patterns table)

- [ ] **Step 1: Write the failing test**

```python
# tests/test_lead_dispatch_uses_pointer.py
from pathlib import Path
import re

REPO_ROOT = Path(__file__).resolve().parents[1]
LEAD = REPO_ROOT / "agents/SKILL.md"


def test_lead_skill_documents_pointer_only_dispatch():
    content = LEAD.read_text(encoding="utf-8")
    assert "fragment: prompts/fragments/required-reading-block.md" in content, \
        "lead skill must document the pointer-only dispatch form"


def test_lead_skill_forbids_inlining_fragment_body():
    content = LEAD.read_text(encoding="utf-8")
    # The no-inline rule must be stated explicitly. Backticks around the clause
    # names are optional so that the Anti-patterns rows match.
    assert re.search(
        r"Inlining (the )?`?\[Required reading\]`? (or `?\[Error reporting\]`? )?fragment body",
        content,
    ), "lead skill must forbid inlining fragment body in dispatch prompts"
```

- [ ] **Step 2: Confirm the tests fail**

```bash
python3 -m pytest tests/test_lead_dispatch_uses_pointer.py -v
```

Expected: 2 FAIL.

- [ ] **Step 3: Update the Phase 2-5 box in `agents/SKILL.md`**

Around `agents/SKILL.md` L177-184 (the "Phase 2 — Phase 5: Prompt preparation..." box), replace the two bullets `- The [Required reading] clause (audience-scoped enumeration...)` and `- The [Error reporting] clause and the asymmetry...` with the following:

```markdown
- The `[Required reading]` clause — for in-process `claude-worker`, the agent definition (`agents/workers/claude-worker.md` "## Required Reading Before Any Analysis") owns the body. For CLI-wrapper workers (`codex-worker` / `gemini-worker`), the lead's dispatch prompt carries ONLY the pointer line `[Required reading] — fragment: prompts/fragments/required-reading-block.md (resolve before CLI launch)`. The wrapper subagent reads the fragment, performs placeholder substitution, and inlines the result into the CLI stdin payload. Lead MUST NOT embed the fragment body itself — the pointer-only form is what keeps lead context flat across runs.
- The `[Error reporting]` clause — same pointer-only pattern. Lead emits `[Error reporting] — fragment: prompts/fragments/error-reporting-block.md (resolve before CLI launch, substitute <role-slug>)`. Wrapper subagent expands; lead does not.
- Audience-scoped enumeration (analysis workers vs report-writer vs reverify dispatches) and the asymmetry between in-process vs CLI workers remain as documented in [okstra-team-contract](./skills/okstra-team-contract/SKILL.md).
```

- [ ] **Step 4: Add new rows to the Anti-patterns table**

Append two rows to the end of the anti-patterns table in `agents/SKILL.md` (around L340-356):

```markdown
| Inlining the `[Required reading]` fragment body into the lead's dispatch prompt for codex/gemini workers | Use the pointer-only form per [okstra-team-contract](./skills/okstra-team-contract/SKILL.md) "Required reading clause" — wrapper subagent resolves the fragment, lead does not. Inline form re-enters the lead's context on every cache miss and is the original cause of the lead-context bloat this rule fixes |
| Inlining the `[Error reporting]` fragment body into the dispatch prompt | Same as above — pointer-only at the lead layer, expansion at the wrapper layer |
```

- [ ] **Step 5: Confirm the tests pass**

```bash
python3 -m pytest tests/test_lead_dispatch_uses_pointer.py -v
```

Expected: 2 PASS.

- [ ] **Step 6: Commit**

```bash
git add agents/SKILL.md tests/test_lead_dispatch_uses_pointer.py
git commit -m "docs(agents/SKILL): document pointer-only dispatch form for required-reading/error-reporting"
```

### Task B6: Add `prompts/fragments/` to the build pipeline

**Files:**
- Modify: `tools/build.mjs`

- [ ] **Step 1: Check the current sync targets**

```bash
grep -n "prompts\|fragments" tools/build.mjs
```

Find the part of `tools/build.mjs` that rsyncs the `prompts/` directory (if fragments are already synced, this task is a no-op).

- [ ] **Step 2: Write the failing test**

```python
# tests/test_build_includes_fragments.py
import subprocess
from pathlib import Path

REPO_ROOT = Path(__file__).resolve().parents[1]


def test_build_emits_fragments_into_runtime():
    subprocess.run(["npm", "run", "build"], check=True, cwd=REPO_ROOT, capture_output=True)
    rr = REPO_ROOT / "runtime/prompts/fragments/required-reading-block.md"
    er = REPO_ROOT / "runtime/prompts/fragments/error-reporting-block.md"
    assert rr.is_file(), f"build did not emit {rr}"
    assert er.is_file(), f"build did not emit {er}"
```

- [ ] **Step 3: Run the test; if it already passes, jump to Step 5**

```bash
python3 -m pytest tests/test_build_includes_fragments.py -v
```

If the `prompts/**/*` rsync glob already picks up fragments, this PASSes. Otherwise it FAILs.

- [ ] **Step 4: (only on FAIL) Add the fragments path to `tools/build.mjs`**

Modify the prompts sync block in `tools/build.mjs` so that the directory glob includes fragments. Skip this if the glob already has the form `prompts/**/*`; if it only picks up specific subdirectories such as `prompts/profiles/*.md`, add `prompts/fragments/*.md` as well.

```javascript
// Example: before
// await rsync("prompts/profiles/", "runtime/prompts/profiles/");
// After
await rsync("prompts/profiles/", "runtime/prompts/profiles/");
await rsync("prompts/fragments/", "runtime/prompts/fragments/");
```
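Beyond checking that the files exist, it can be useful to confirm that the runtime copies actually mirror the sources. A minimal sketch, assuming the `prompts/fragments/` to `runtime/prompts/fragments/` layout used by the test above; `stale_fragments` is a hypothetical helper, not part of the build:

```python
import filecmp
from pathlib import Path
from typing import List


def stale_fragments(repo_root: Path) -> List[str]:
    """Return fragment file names whose runtime copy is missing or differs."""
    src = repo_root / "prompts/fragments"
    dst = repo_root / "runtime/prompts/fragments"
    stale = []
    for source_file in sorted(src.glob("*.md")):
        copy = dst / source_file.name
        # shallow=False forces a content comparison instead of a stat-only check.
        if not copy.is_file() or not filecmp.cmp(source_file, copy, shallow=False):
            stale.append(source_file.name)
    return stale
```

An empty list means every source fragment has an identical runtime copy.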

- [ ] **Step 5: Confirm the test passes**

```bash
python3 -m pytest tests/test_build_includes_fragments.py -v
```

Expected: PASS.

- [ ] **Step 6: Commit**

```bash
git add tools/build.mjs tests/test_build_includes_fragments.py
git commit -m "build: include prompts/fragments/ in runtime sync"
```

### Task B7: Track B regression + e2e + CHANGES.md

- [ ] **Step 1: Full regression plus one e2e scenario**

```bash
python3 -m pytest tests/ -x -q
bash validators/validate-workflow.sh
bash tests-e2e/scenario-01-record-start-reconcile.sh
```

Expected: all PASS.

- [ ] **Step 2: Add a user-impact entry to `CHANGES.md`**

Add the following at the very top of `CHANGES.md` (above the most recent entry):

```markdown
### refactor(team-contract,workers): compress the [Required reading]/[Error reporting] bodies into fragment pointers

- **Background**: The verbatim `[Required reading]` (~48 lines) and `[Error reporting]` (~27 lines) bodies that the lead wrote out on every worker dispatch were the main driver of cumulative cost in the lead context baseline. 3 workers × multi-stage phases × a cache_read on every turn made up a large share of the lead's totalTokens.
- **Changes**:
  - Extracted the two clause bodies into the SSOT files `prompts/fragments/required-reading-block.md` / `prompts/fragments/error-reporting-block.md`.
  - Replaced the verbatim-body rule in `skills/okstra-team-contract/SKILL.md` with a pointer rule: the lead's dispatch prompt sends only the single line `[Required reading] — fragment: <path>`.
  - Added the `## Resolve required-reading + error-reporting pointers` section to `agents/workers/codex-worker.md` / `gemini-worker.md`: the wrapper subagent reads the fragment → substitutes placeholders → inlines the result into the CLI stdin.
  - Updated the Phase 2-5 box and the Anti-patterns table in `agents/SKILL.md`.
- **User impact**: Propagates with a single `npx -y okstra@latest install` after the next release. The lead no longer holds the roughly 75-line (~3KB) body in its own context for each codex/gemini dispatch. When N dispatches occur within the same phase, the cumulative saving is N × 75 lines. The wrapper subagent's context absorbs the fragment only once per dispatch, so unlike the lead side it does not accumulate. **Compatibility impact**: existing prompt-history files keep the expanded body, so the format auditors read does not change. Only the lead's in-memory prompt is pointer-only.
```
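The cumulative-saving figure in the entry above is simple arithmetic. The sketch below makes it explicit, assuming the ~75-line body and the ~2 pointer lines quoted in this plan; `lead_lines_saved` is a hypothetical helper. The entry rounds to N × 75, while subtracting the pointer lines that still travel gives a slightly smaller number.

```python
def lead_lines_saved(dispatches: int, body_lines: int = 75, pointer_lines: int = 2) -> int:
    """Lines no longer carried in lead context across `dispatches` dispatches."""
    if dispatches < 0:
        raise ValueError("dispatches must be non-negative")
    # Each dispatch swaps the full body for the pointer lines.
    return dispatches * (body_lines - pointer_lines)


if __name__ == "__main__":
    # Three worker dispatches in one phase.
    print(lead_lines_saved(3))                   # net of the pointer lines
    print(lead_lines_saved(3, pointer_lines=0))  # the rounded N x 75 figure
```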

- [ ] **Step 3: Commit**

```bash
git add CHANGES.md
git commit -m "docs(changes): record Required reading/Error reporting pointer compression"
```

---

## Track D — Tighten the recommended stage step count

**Why now:** After B. No cross-cutting impact. The validator's hard cap stays as is and only the recommended (soft) value is tightened to ≤3, so the burden on the planning stage is small while the average executor sync block length is halved.

### Task D1: Edit the body of the `implementation-planning.md` profile

**Files:**
- Modify: `prompts/profiles/implementation-planning.md` (L64 + L95)

- [ ] **Step 1: Write the failing test**

```python
# tests/test_implementation_planning_step_guidance.py
from pathlib import Path

REPO_ROOT = Path(__file__).resolve().parents[1]
PROFILE = REPO_ROOT / "prompts/profiles/implementation-planning.md"


def test_step_count_recommendation_is_three():
    content = PROFILE.read_text(encoding="utf-8")
    # New recommendation wording
    assert "권장 ≤ 3" in content or "recommended ≤ 3" in content, \
        "implementation-planning profile must recommend ≤3 steps per stage"
    # Hard cap 6 is kept
    assert "절대 상한 6" in content or "hard cap 6" in content


def test_self_review_step_seven_mentions_three():
    content = PROFILE.read_text(encoding="utf-8")
    # The ≤3 recommendation must appear inside Self-review pass step 7 (Stage Map self-check)
    self_review = content[content.index("Stage Map self-check"):]
    assert "≤ 3" in self_review[:1500] or "≤3" in self_review[:1500], \
        "Stage Map self-check must call out the ≤3 recommendation"
```

- [ ] **Step 2: Confirm the tests fail**

```bash
python3 -m pytest tests/test_implementation_planning_step_guidance.py -v
```

Expected: 2 FAIL.

- [ ] **Step 3: Edit L64 (the Stepwise Execution Order bullet)**

Replace the following line in `prompts/profiles/implementation-planning.md`:

```markdown
- `### Stepwise Execution Order` — bite-sized table with `step | action | files | command | expected`. **Effective row count ≤ 6** (excluding header / divider / blank). Each step is one action completable in 2–5 minutes; for code steps include actual code or diff sketch; prefer TDD ordering (failing test → implementation → green → commit).
```

with:

```markdown
- `### Stepwise Execution Order` — bite-sized table with `step | action | files | command | expected`. **Effective row count recommended ≤ 3, hard cap 6** (excluding header / divider / blank). When a stage exceeds the recommended value, first check whether it can be partitioned more finely, and keep 4-6 steps only when partitioning is genuinely impossible: the wall-clock of one executor sync block is proportional to stage length, so shorter stages raise the likelihood that the implementation lead can handle them in a single session. Each step is one action completable in 2–5 minutes; for code steps include actual code or diff sketch; prefer TDD ordering (failing test → implementation → green → commit).
```
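The "effective row count" rule (excluding header, divider, and blank lines) can be made concrete. A minimal sketch, assuming a standard pipe table; `count_effective_rows` is a hypothetical helper and is not claimed to match the validator's own `_count_effective_steps`:

```python
import re

# Matches a Markdown table divider such as "|------|:----:|------|".
_DIVIDER_RE = re.compile(r"^\|?\s*:?-{3,}:?\s*(\|\s*:?-{3,}:?\s*)*\|?$")


def count_effective_rows(table_md: str) -> int:
    """Count table rows, excluding the header, the divider, and blank lines."""
    rows = [ln.strip() for ln in table_md.splitlines() if ln.strip().startswith("|")]
    body = [ln for ln in rows if not _DIVIDER_RE.match(ln)]
    # The first remaining row is the header; everything after it is a step.
    return max(len(body) - 1, 0)


if __name__ == "__main__":
    table = """
| step | action | files | command | expected |
|------|--------|-------|---------|----------|
| 1 | write test | t.py | pytest | FAIL |
| 2 | implement | m.py | pytest | PASS |
"""
    print(count_effective_rows(table))
```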

- [ ] **Step 4: Edit L95 (Self-review pass step 7)**

Replace the following line:

```markdown
7. **Stage Map self-check** — for every stage, count the effective rows of its `Stepwise Execution Order` table by hand; reject the draft if any stage exceeds 6. Walk the `depends-on` graph and confirm it is a DAG (no cycle, no self-reference). For each `depends-on` link, ask "can this be removed by re-partitioning?" — if yes, re-partition and re-count.
```

with:

```markdown
7. **Stage Map self-check** — for every stage, count the effective rows of its `Stepwise Execution Order` table by hand; reject the draft if any stage exceeds the hard cap 6. **For every stage with 4 or more steps, explicitly ask "can this stage be split into two or more?" and record the rationale for the decision (e.g. "the outputs of step 3 and step 4 must land in the same commit") as one line in the plan body** — a 4-6 step stage violates the recommendation (≤ 3) and must not pass without justification. Walk the `depends-on` graph and confirm it is a DAG (no cycle, no self-reference). For each `depends-on` link, ask "can this be removed by re-partitioning?" — if yes, re-partition and re-count.
```
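The DAG part of the self-check (no cycle, no self-reference) is mechanical enough to automate. A minimal sketch using the standard-library `graphlib` module (Python 3.9+); `check_stage_dag` and the `{stage: [dependencies]}` input shape are assumptions for illustration, not the validator's actual interface:

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List


def check_stage_dag(depends_on: Dict[int, List[int]]) -> List[int]:
    """Return a valid execution order, or raise ValueError on a bad graph."""
    for stage, deps in depends_on.items():
        if stage in deps:
            raise ValueError(f"stage {stage} depends on itself")
        unknown = [d for d in deps if d not in depends_on]
        if unknown:
            raise ValueError(f"stage {stage} depends on unknown stage(s) {unknown}")
    try:
        # TopologicalSorter takes {node: predecessors}, which matches depends-on.
        return list(TopologicalSorter(depends_on).static_order())
    except CycleError as exc:
        # args[1] holds the list of nodes that form the detected cycle.
        raise ValueError(f"depends-on graph has a cycle: {exc.args[1]}") from exc


if __name__ == "__main__":
    print(check_stage_dag({1: [], 2: [1], 3: [1, 2]}))
```

The explicit self-reference check runs first so that the error message names the offending stage directly rather than reporting a one-node cycle.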

- [ ] **Step 5: Confirm the tests pass**

```bash
python3 -m pytest tests/test_implementation_planning_step_guidance.py -v
```

Expected: 2 PASS.

- [ ] **Step 6: Commit**

```bash
git add prompts/profiles/implementation-planning.md tests/test_implementation_planning_step_guidance.py
git commit -m "feat(planning): recommend ≤3 steps per stage; require justification when 4-6"
```

### Task D2: Add a step > 3 warning to the validator

**Files:**
- Modify: `validators/validate-implementation-plan-stages.py`
- Test: `tests/test_validate_implementation_plan_stages_warnings.py`

- [ ] **Step 1: Write the failing test**

```python
# tests/test_validate_implementation_plan_stages_warnings.py
"""Verify the new warning behaviour of validate-implementation-plan-stages.py.

Goals:
- steps ∈ {2, 3} → 0 warnings, exit 0
- steps ∈ {4, 5, 6} → 1 warning per stage, exit 0 (hard cap kept)
- steps == 7 → existing S5 error, exit non-zero
"""
import subprocess
import sys
import textwrap
from pathlib import Path

REPO_ROOT = Path(__file__).resolve().parents[1]
SCRIPT = REPO_ROOT / "validators/validate-implementation-plan-stages.py"


def _plan_with_steps(step_count: int) -> str:
    """Body of a single-stage plan with `step_count` effective rows."""
    rows = "\n".join(
        f"| {i+1} | action-{i+1} | file.py | cmd | expected |" for i in range(step_count)
    )
    # Dedent the template before substituting: the multi-line `rows` value has
    # no leading indentation and would otherwise keep textwrap.dedent from
    # stripping anything.
    template = textwrap.dedent("""\
        ## 4.5 Stage Map

        | stage | title | depends-on | step-count | exit-contract-summary |
        |-------|-------|------------|------------|-----------------------|
        | 1 | Solo | (none) | {step_count} | exit |

        ## 4.5.1 Stage 1: Solo

        ### Carry-In
        task-brief only.

        ### Stepwise Execution Order
        | step | action | files | command | expected |
        |------|--------|-------|---------|----------|
        {rows}

        ### Stage Exit Contract
        exit details.

        ### Stage Validation
        - pre: noop
        - post: noop
        """)
    return template.format(step_count=step_count, rows=rows)


def _run(plan_text: str, tmp_path: Path) -> subprocess.CompletedProcess:
    plan = tmp_path / "plan.md"
    plan.write_text(plan_text, encoding="utf-8")
    return subprocess.run(
        [sys.executable, str(SCRIPT), str(plan)],
        capture_output=True, text=True,
    )


def test_two_steps_no_warning(tmp_path):
    r = _run(_plan_with_steps(2), tmp_path)
    assert r.returncode == 0
    assert "WARN" not in r.stderr


def test_three_steps_no_warning(tmp_path):
    r = _run(_plan_with_steps(3), tmp_path)
    assert r.returncode == 0
    assert "WARN" not in r.stderr


def test_four_steps_warning_but_pass(tmp_path):
    r = _run(_plan_with_steps(4), tmp_path)
    assert r.returncode == 0, f"unexpected fail: {r.stderr}"
    assert "WARN" in r.stderr and "step-count=4" in r.stderr


def test_six_steps_warning_but_pass(tmp_path):
    r = _run(_plan_with_steps(6), tmp_path)
    assert r.returncode == 0
    assert "WARN" in r.stderr and "step-count=6" in r.stderr


def test_seven_steps_hard_fail(tmp_path):
    r = _run(_plan_with_steps(7), tmp_path)
    assert r.returncode != 0
    assert "S5" in (r.stdout + r.stderr)
```

- [ ] **Step 2: Confirm the tests fail**

```bash
python3 -m pytest tests/test_validate_implementation_plan_stages_warnings.py -v
```

Expected: 2 FAIL (the 4-step and 6-step cases, because the current validator emits no warning). The 2-, 3-, and 7-step cases should already pass against the existing hard cap.

- [ ] **Step 3: Edit the validator code**

Add to the constants near L1 of `validators/validate-implementation-plan-stages.py`:

```python
RECOMMENDED_STEP_CAP = 3
HARD_STEP_CAP = 6
```

Replace the `if steps > 6:` block inside the `_check_subsections` function (currently around L131-146) with:

```python
# S5: effective step count hard cap
steps = _count_effective_steps(section)
if steps > HARD_STEP_CAP:
    errs.append(ValidationError("S5", s.stage_number,
                                f"effective step count {steps} exceeds hard cap {HARD_STEP_CAP}"))
elif steps > RECOMMENDED_STEP_CAP:
    # Warning only: no effect on the exit code. One line on stderr.
    import sys as _sys
    print(
        f"WARN stage {s.stage_number}: step-count={steps} exceeds "
        f"recommended cap {RECOMMENDED_STEP_CAP} (hard cap {HARD_STEP_CAP}); "
        "consider splitting this stage",
        file=_sys.stderr,
    )
```

(If `import sys` already exists at the top of the file, drop the inline import and use only the module-level one.)
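
The three outcomes the tests expect can be summarised as a small pure function. A sketch for illustration only; `classify_step_count` is a hypothetical helper, and the constants mirror the ones introduced above:

```python
RECOMMENDED_STEP_CAP = 3
HARD_STEP_CAP = 6


def classify_step_count(steps: int) -> str:
    """Map a stage's effective step count to "ok", "warn", or "error"."""
    if steps > HARD_STEP_CAP:
        return "error"  # S5: validation fails with a non-zero exit
    if steps > RECOMMENDED_STEP_CAP:
        return "warn"   # one stderr line, exit code unaffected
    return "ok"


if __name__ == "__main__":
    for n in range(2, 8):
        print(n, classify_step_count(n))
```

Keeping the classification separate from the printing makes the boundaries (3 → 4 and 6 → 7) easy to unit-test without spawning the validator.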

- [ ] **Step 4: Confirm the tests pass**

```bash
python3 -m pytest tests/test_validate_implementation_plan_stages_warnings.py -v
```

Expected: 5 PASS.

- [ ] **Step 5: Check the existing validator tests for regressions**

```bash
python3 -m pytest tests/ -x -q -k validate
```

Expected: 0 regressions.

- [ ] **Step 6: Commit**

```bash
git add validators/validate-implementation-plan-stages.py tests/test_validate_implementation_plan_stages_warnings.py
git commit -m "feat(validators/plan-stages): warn (no fail) when stage step-count > 3"
```

### Task D3: Update the template examples + CHANGES.md

**Files:**
- Modify: `templates/reports/implementation-planning-input.template.md`
- Modify: `CHANGES.md`

- [ ] **Step 1: Check the current examples**

```bash
grep -n -A2 "stage | title | depends-on | step-count" templates/reports/implementation-planning-input.template.md
```

Check what the example values in the `step-count` column are.

- [ ] **Step 2: Update the example step-count values to ≤3**

Change every step-count value in the two example Stage Map tables (around L111-119) of `templates/reports/implementation-planning-input.template.md` to 2 or 3, and insert the following single line directly before the table:

```markdown
> Recommended: effective step count per stage ≤ 3 (hard cap 6). For 4 or more steps, first check whether the stage can be split.
```

- [ ] **Step 3: Add a user-impact entry to `CHANGES.md`**

```markdown
### feat(planning): tighten the recommended step count per stage to ≤ 3 (hard cap 6 unchanged)

- **Background**: When a single stage of the implementation phase is partitioned into 5-6 steps, the wall-clock of one executor sync block grows, and it was repeatedly observed to exceed what the lead can handle in a single session. The shorter a stage, the more likely it is that dispatch → verification → persist all finish within one lead session.
- **Changes**:
  - Tightened the "Effective row count ≤ 6" rule in `prompts/profiles/implementation-planning.md` to "recommended ≤ 3, hard cap 6". A 4-6 step stage must record a one-line justification (e.g. "the outputs of steps 3 and 4 must land in the same commit") in the plan body.
  - Added a `step > 3` warning to `validators/validate-implementation-plan-stages.py` (one stderr line, no effect on the exit code). The `step > 6` hard fail stays as before.
  - Updated the step-count column values in the template examples to ≤ 3.
- **User impact**: Propagates with a single `npx -y okstra@latest install` after the next release. **Impact on existing plans**: plans already produced stay as they are (the validator only emits a warning). **From new implementation-planning runs onward**: the report-writer is more likely to partition one level finer, and the average wall-clock of an implementation run shortens accordingly. **Compatibility impact**: 4-6 step stages still pass, so this is not a breaking change; the warning is only a trigger for the next partition iteration.
```

- [ ] **Step 4: Regression check**

```bash
python3 -m pytest tests/ -x -q
```

- [ ] **Step 5: Commit**

```bash
git add templates/reports/implementation-planning-input.template.md CHANGES.md
git commit -m "docs(template,changes): update Stage Map examples to ≤3 steps"
```

---

## Track C — implementation profile lazy split

**Why last:** Enter only after looking at one round of measurements taken with B/D applied. If the lead context is already slim enough, the design complexity of C's BLOCKING gate is hardly worth taking on. Proceed only if, even after measuring, the lead still cannot finish within phase 6.

**Decision gate before C:** Proceed with C only when both of the following hold. Otherwise classify C as a holdback and close the plan.

1. In the first implementation measurement after applying B/D, did the lead's totalTokens drop by 30 % or more against the baseline? (If it did not, C is unlikely to have much effect.)
2. Is the lead still unable to finish phase 6 within a single lead session with B/D alone? (If B/D alone are enough, C is over-engineering.)

These measurements are outside the scope of this plan. Report the results to the user, then decide whether to enter C.
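
The decision gate can be written down as a predicate. The sketch below reads the gate the way the "Why last" paragraph does: proceed only when B/D cut the lead's totalTokens by at least 30 % and phase 6 still does not fit in one lead session. `should_proceed_with_c` and its parameter names are hypothetical, and the token counts must come from real measurements.

```python
def should_proceed_with_c(baseline_tokens: int, measured_tokens: int,
                          finished_in_one_session: bool,
                          min_reduction: float = 0.30) -> bool:
    """Return True when both decision-gate conditions for Track C hold."""
    if baseline_tokens <= 0:
        raise ValueError("baseline_tokens must be positive")
    # Condition 1: relative reduction of lead totalTokens after B/D.
    reduction = (baseline_tokens - measured_tokens) / baseline_tokens
    # Condition 2: B/D alone were not enough to finish phase 6 in one session.
    return reduction >= min_reduction and not finished_in_one_session


if __name__ == "__main__":
    print(should_proceed_with_c(1_000_000, 650_000, finished_in_one_session=False))
```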
|
|
1034
|
+
|
|
1035
|
+
이하 task 들은 **C 진행이 결정된 후에만** 실행한다.
|
|
1036
|
+
|
|
1037
|
+
### Task C1: 3개 sidecar 파일 생성 (executor / verifier / deliverable)
|
|
1038
|
+
|
|
1039
|
+
**Files:**
|
|
1040
|
+
- Create: `prompts/profiles/_implementation-executor.md`
|
|
1041
|
+
- Create: `prompts/profiles/_implementation-verifier.md`
|
|
1042
|
+
- Create: `prompts/profiles/_implementation-deliverable.md`
|
|
1043
|
+
- Test: `tests/test_implementation_profile_split.py` (1차)
|
|
1044
|
+
|
|
1045
|
+
- [ ] **Step 1: 실패 테스트 작성 (sidecar 존재 + content invariant)**
|
|
1046
|
+
|
|
1047
|
+
```python
# tests/test_implementation_profile_split.py
from pathlib import Path

REPO_ROOT = Path(__file__).resolve().parents[1]
PROFILES = REPO_ROOT / "prompts/profiles"
MAIN = PROFILES / "implementation.md"
EXECUTOR = PROFILES / "_implementation-executor.md"
VERIFIER = PROFILES / "_implementation-verifier.md"
DELIVERABLE = PROFILES / "_implementation-deliverable.md"


def test_three_sidecars_exist():
    for p in (EXECUTOR, VERIFIER, DELIVERABLE):
        assert p.is_file(), f"sidecar missing: {p}"


def test_executor_sidecar_owns_tdd_loop():
    content = EXECUTOR.read_text(encoding="utf-8")
    assert "Mandatory TDD loop" in content
    assert "red-green-refactor" in content
    assert "Sidecar evidence writer" in content


def test_verifier_sidecar_owns_qa_duties():
    content = VERIFIER.read_text(encoding="utf-8")
    assert "Two-tier command lookup" in content
    assert "deny-list" in content
    assert "Read-only command log" in content


def test_deliverable_sidecar_owns_final_report_shape():
    content = DELIVERABLE.read_text(encoding="utf-8")
    assert "Required deliverable shape" in content
    assert "Self-review pass" in content
    assert "Lead post-stage persistence" in content


def test_sidecars_do_not_use_include_directive():
    """The lead reads sidecars directly, so INCLUDE token expansion is unnecessary."""
    for p in (EXECUTOR, VERIFIER, DELIVERABLE):
        assert "{{INCLUDE:" not in p.read_text(encoding="utf-8"), \
            f"{p.name} should not rely on INCLUDE expansion"
```

- [ ] **Step 2: Confirm the tests fail**

```bash
python3 -m pytest tests/test_implementation_profile_split.py -v
```

Expected: 5 FAIL.

- [ ] **Step 3: Write `_implementation-executor.md` by copying the following sections over from the current `implementation.md`**

Move these sections in full (currently at `implementation.md` L64-86 + L87-100):

- `Pre-implementation context exploration` (L64-78)
- `Stage execution contract` (L79-86)
- `Allowed actions during the run` (L87-100), excluding the verifier-specific items

Also place the following header at the very top of the file:

```markdown
<!--
Implementation profile — executor sidecar. Lead lazy-loads this file at the
start of Phase 5 (executor dispatch). NOT included via `{{INCLUDE:}}` because
lead context should NOT carry this body during Phase 1-4. Read once, retain
until Phase 5 ends, then drop from active context for Phase 6/7.
-->

# Implementation profile — Executor sidecar

> **When to read**: lead reads this file ONCE at the start of Phase 5 (after Stage Map parse, before issuing the Executor's first `Edit`/`Write`). The body governs ONLY the Executor role's behaviour. Verifier / report-writer behaviour lives in sibling sidecars.

```

Then paste the original sections verbatim, normalising heading levels to `##` (parts that started as `-` bullets are promoted as appropriate, for example to `## Pre-implementation context exploration`).
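
The heading normalisation described above could be scripted rather than done by hand. The following is a minimal sketch, assuming the moved sections start with a top-level bullet of the form `- Title:` and that nested bullets are indented by two spaces; `promote_bullets_to_headings` is a hypothetical helper, not part of okstra.

```python
import re

# Hypothetical helper: a top-level "- Title:" bullet (nothing after the colon)
# becomes a "## Title" heading; its nested bullets are dedented by one level.
TOP_BULLET = re.compile(r"^- (?P<title>[^:\n]+):\s*$")


def promote_bullets_to_headings(text: str) -> str:
    out = []
    for line in text.splitlines():
        match = TOP_BULLET.match(line)
        if match:
            if out and out[-1] != "":
                out.append("")  # keep a blank line before each new heading
            out.append(f"## {match.group('title').strip()}")
            out.append("")
        elif line.startswith("  "):
            out.append(line[2:])  # assumption: two-space indentation per level
        else:
            out.append(line)
    return "\n".join(out)


sample = "- Stage execution contract:\n  - one stage per run\n  - commit after each stage"
print(promote_bullets_to_headings(sample))
```

Bullets that carry text after the colon (such as `- Purpose: ...`) are left untouched, so the result still needs a manual read-through before committing.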

- [ ] **Step 4: Write `_implementation-verifier.md`**

Move the following sections:

- The `Verifier QA duties` block (currently around `implementation.md` L20-40)
- The verifier-specific bullets under `Forbidden actions` (L113-121)
- `All-verifier-failure policy` (L43)

Header:

```markdown
<!--
Implementation profile — verifier sidecar. Lead lazy-loads this file ONCE
at Phase 5, BEFORE constructing the verifier worker dispatch prompts.
-->

# Implementation profile — Verifier sidecar

> **When to read**: lead reads this file ONCE in Phase 5, between executor stage completion and the first verifier dispatch. Carries the two-tier command lookup, deny-list, discrepancy rule, and verifier-specific forbidden actions.

```

- [ ] **Step 5: Write `_implementation-deliverable.md`**

Move the following sections:

- `Required deliverable shape` (L122-143)
- `Self-review pass` (L144-154)
- `Lead post-stage persistence` (L155-159)

Header:

```markdown
<!--
Implementation profile — deliverable sidecar. Lead lazy-loads this file ONCE
at the start of Phase 6 (report-writer dispatch), AFTER all worker results
are collected and convergence finished. Phase 1-5 do not need it.
-->

# Implementation profile — Deliverable sidecar

> **When to read**: lead reads this file ONCE at the start of Phase 6 (after Phase 5.5 convergence completes, before constructing the report-writer dispatch prompt). Carries the final-report deliverable shape, lead's post-stage persistence rules, and the self-review checklist.

```

- [ ] **Step 6: Confirm the tests pass**

```bash
python3 -m pytest tests/test_implementation_profile_split.py -v
```

Expected: 5 PASS.

- [ ] **Step 7: Commit**

```bash
git add prompts/profiles/_implementation-executor.md prompts/profiles/_implementation-verifier.md prompts/profiles/_implementation-deliverable.md tests/test_implementation_profile_split.py
git commit -m "feat(profiles): split implementation.md sections into three lazy-load sidecars"
```

### Task C2: Reduce the `implementation.md` body to a thin core

**Files:**
- Modify: `prompts/profiles/implementation.md`
- Test: `tests/test_implementation_profile_split.py` (second pass: substring preservation)

- [ ] **Step 1: Write the failing tests (thin-core size + substring preservation)**

Add the following to `tests/test_implementation_profile_split.py`:

```python
def test_main_profile_is_thin_core():
    """The thin core targets about 50 lines; the enforced budget is 65 effective lines (baseline cost reduction goal)."""
    lines = MAIN.read_text(encoding="utf-8").splitlines()
    effective = [ln for ln in lines if ln.strip() and not ln.strip().startswith("<!--")]
    assert len(effective) <= 65, f"main profile should be ≤65 effective lines, got {len(effective)}"


def test_main_profile_keeps_pre_implementation_gate():
    """The gate rules needed on entering Phase 1 must stay in the thin core."""
    content = MAIN.read_text(encoding="utf-8")
    assert "Pre-implementation gate" in content
    assert "approved: true" in content


def test_main_profile_keeps_task_worktree_block():
    """The Phase 1 worktree binding also stays in the thin core."""
    content = MAIN.read_text(encoding="utf-8")
    assert "Task worktree" in content
    assert "EXECUTOR_WORKTREE_PATH" in content


def test_main_profile_has_lazy_section_pointers():
    """The thin core must carry a box stating when each of the three sidecars is read."""
    content = MAIN.read_text(encoding="utf-8")
    assert "_implementation-executor.md" in content
    assert "_implementation-verifier.md" in content
    assert "_implementation-deliverable.md" in content
    assert "Lazy section pointers" in content or "lazy section pointers" in content


def test_main_plus_sidecars_cover_critical_tokens():
    """Every critical token of the pre-split implementation.md must survive in the union of thin core + 3 sidecars."""
    union = "\n".join(
        p.read_text(encoding="utf-8") for p in (MAIN, EXECUTOR, VERIFIER, DELIVERABLE)
    )
    critical_tokens = [
        # executor side
        "Mandatory TDD loop",
        "red-green-refactor",
        "Stage execution contract",
        "Sidecar evidence writer",
        "Reverse link",
        "One-PR-per-stage",
        "Out-of-plan edits",
        "Commit message format",
        # verifier side
        "Two-tier command lookup",
        "Tier 1 — plan validation set",
        "Tier 2 — project baseline",
        "deny-list",
        "Discrepancy rule",
        "Read-only command log",
        "All-verifier-failure policy",
        # deliverable side
        "Plan link & approval evidence",
        "Stage sidecar evidence",
        "Validation evidence",
        "TDD evidence",
        "Verifier results",
        "Rollback verification",
        "Routing recommendation for `final-verification`",
        "Follow-up tasks",
        "Self-review pass before finalising the report",
        "Lead post-stage persistence",
        # thin core side
        "Pre-implementation gate",
        "Task worktree",
        "EXECUTOR_WORKTREE_PATH",
    ]
    missing = [t for t in critical_tokens if t not in union]
    assert not missing, f"tokens dropped during split: {missing}"
```

- [ ] **Step 2: Confirm the tests fail: `test_main_profile_is_thin_core` and others must FAIL**

```bash
python3 -m pytest tests/test_implementation_profile_split.py -v -k "thin_core or lazy_section or critical_tokens"
```

Expected: some FAIL (the current implementation.md is still 161 lines).
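
One detail of the size check is worth seeing in isolation: the filter in `test_main_profile_is_thin_core` drops only lines that begin with `<!--`, so the body and closing line of a multi-line HTML comment still count toward the budget. The sketch below contrasts that filter with a comment-aware counter; `count_effective_lines` is a hypothetical alternative, not something the plan prescribes, and it simplifies by ignoring comments that start mid-line.

```python
def count_effective_lines(text: str) -> int:
    """Count non-blank lines, skipping every line that lies inside an HTML comment."""
    count = 0
    in_comment = False
    for raw in text.splitlines():
        line = raw.strip()
        if in_comment:
            if "-->" in line:
                in_comment = False  # the closing line itself is not counted
            continue
        if line.startswith("<!--"):
            if "-->" not in line:
                in_comment = True  # comment continues on later lines
            continue
        if line:
            count += 1
    return count


sample = "<!--\nheader note\n-->\n# Title\n\n- rule one\n- rule two\n"

# The filter used by the test in this plan, reproduced for comparison.
plan_filter_count = len(
    [ln for ln in sample.splitlines() if ln.strip() and not ln.strip().startswith("<!--")]
)
comment_aware_count = count_effective_lines(sample)
print(plan_filter_count, comment_aware_count)
```

Since the thin core in this plan carries no multi-line comment, both counters should agree on it; the difference matters only if a comment header is added to `implementation.md` later.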

- [ ] **Step 3: Replace the `implementation.md` body with the thin core**

Replace the entire contents of `prompts/profiles/implementation.md` with:

```markdown
# Implementation Profile

- Purpose: realise the approved `implementation-planning` deliverable as actual source changes, with cross-model verification, while keeping the run reversible
- Required workers:
  - claude
  - codex
  - report-writer
- Optional workers (opt-in via `--workers`):
  - gemini — when added to the roster it joins the verifier set; when omitted only the Claude+Codex verifiers participate (`--executor gemini` is therefore not selectable without explicitly listing `gemini` in `--workers`)
- **Executor binding (resolved at run-prep time, fixed for this run):**
  - Executor display name: `{{EXECUTOR_DISPLAY_NAME}}`
  - Executor provider: `{{EXECUTOR_PROVIDER}}` (one of: `claude` | `codex` | `gemini`; chosen via `--executor` or `OKSTRA_DEFAULT_EXECUTOR`, default `claude`)
  - Executor subagent for dispatch: `{{EXECUTOR_WORKER_AGENT}}`
  - Executor model: `{{EXECUTOR_MODEL_DISPLAY}}` (launch value: `{{EXECUTOR_MODEL_EXECUTION_VALUE}}`)
  - Wherever this profile mentions the `Executor`, it refers to the role bound above. The other two providers in the roster (`claude` / `codex` / `gemini` minus the executor) are dispatched as **verifiers only** for this run and remain strictly read-only.
{{INCLUDE:_common-contract.md}}
- Pre-implementation gate (mandatory — refuse to start if any item fails):
  - the run brief MUST cite `--approved-plan <path>` pointing to a `final-report.md` produced by a prior `implementation-planning` run located under `runs/implementation-planning/.../reports/final-report.md`
  - that file's YAML frontmatter MUST carry `approved: true`. report-writer emits `approved: false` by default; the user flips it to `true` to authorise this run. Free-form approvals such as "lgtm" / "go ahead" / paraphrased confirmations are NOT accepted; re-edit the plan file's frontmatter to `approved: true` before invoking implementation, or pass `--approve` so the CLI flips it on the user's behalf (`okstra_ctl.run._apply_cli_approval`).
  - The `--approve` flag is meaningful ONLY with `--task-type implementation` and `--approved-plan <path>`; any other use raises `PrepareError`. Idempotent — re-running with `approved: true` already set appends an audit line but does NOT re-toggle.
  - the file's `Recommended option` and its bite-sized step list become the authoritative scope for this run; deviations must be justified in the final report and routed back to a new `implementation-planning` run rather than silently expanded.
- Task worktree (provisioned by `okstra-ctl` at the first phase's run-prep time, reused for every subsequent phase of this task-key):
  - Status: `{{EXECUTOR_WORKTREE_STATUS}}` (one of: `created` | `reused` | `skipped-in-worktree` | `skipped-not-git`)
  - Working tree path: `{{EXECUTOR_WORKTREE_PATH}}` — when status is `created` or `reused`, this is the task's `git worktree` rooted at `~/.okstra/worktrees/<project>/<task-group>/<task-id>/`. When skipped, this is the caller's `project_root`.
  - Branch: `{{EXECUTOR_WORKTREE_BRANCH}}` — empty when status is `skipped-*`. Branch name = `<work-category-prefix>-<task-id-segment>`, globally unique via `~/.okstra/worktrees/registry.json`.
  - Base ref: `{{EXECUTOR_WORKTREE_BASE_REF}}` — canonical `<base>` for every `git diff` / `git log` in this run.
  - Provisioning note: `{{EXECUTOR_WORKTREE_NOTE}}`
  - Treat the working-tree path as `project_root` for the duration of this run. Do NOT mutate the caller's original checkout. cwd-sensitive Bash commands MUST be prefixed `cd {{EXECUTOR_WORKTREE_PATH}} && ` in the same Bash invocation (never `bash -lc "..."` wrappers — see executor sidecar for full rules).
  - Lifecycle: kept after the run completes; reused by every subsequent phase of the same task-key. Manual cleanup: `git worktree remove <path>` → `git branch -D <branch>` → drop registry entry.
- Approval gate (phase-specific addendum to shared authority rule):
  - the pre-implementation gate's recorded user approval marker is the only authorised approval gate at this phase — proceed once it is satisfied without further external coordination.
- Forbidden actions — universal (any occurrence → terminal status `contract-violated`):
  - **`git push` of any kind**, including `--dry-run` against a real remote that produces side-effects
  - publishing or release commands: `npm publish`, `cargo publish`, `pip publish`, `gh release`, `docker push`
  - real database migrations or schema changes that touch shared environments
  - production credentials, deploy commands, infra mutation against non-local clusters
  - external API WRITE calls (POST/PUT/PATCH/DELETE) to third-party services other than localhost test fixtures
  - dispatching parallel sub-agents beyond the required worker roster
  - any Edit/Write before the pre-implementation gate has passed
  - source edits or Bash mutations performed by any verifier role (verifier-specific deny-list lives in the verifier sidecar)
- In-phase debugging:
  - isolate root cause before changing the fix direction, but the executor MUST NOT route to a separate `error-analysis` phase mid-run; if a defect blocks plan progress, the executor records findings and routes to a new run after this one ends.

## Lazy section pointers (BLOCKING for lead — load at the listed phase, not at Phase 1)

The bulk of this profile's body is split into three sidecars so the lead's Phase 1 baseline stays under ~50 effective lines. Read each sidecar ONCE, at the phase noted, into the lead's active context — do NOT pre-load them at Phase 1.

| Sidecar | Read at | Purpose |
|---------|---------|---------|
| `prompts/profiles/_implementation-executor.md` | Start of Phase 5 (after Stage Map parse, before Executor's first Edit/Write) | Pre-implementation context exploration, TDD loop, Stage execution contract, allowed actions, commit-message format |
| `prompts/profiles/_implementation-verifier.md` | Phase 5, between Executor stage completion and the first verifier dispatch | Two-tier command lookup, deny-list, discrepancy rule, Read-only command log, verifier-specific forbidden actions |
| `prompts/profiles/_implementation-deliverable.md` | Start of Phase 6 (after Phase 5.5 convergence completes, before report-writer dispatch prompt construction) | Required deliverable shape, Validation/TDD evidence rules, Verifier results structure, Self-review pass, Lead post-stage persistence |

**If the relevant sidecar is not in the lead's context on entering Phase 5 / 6, this is BLOCKING: phase entry refused.** After reading a sidecar, the lead must continue to the phase's follow-up action within the same turn (that is, the sidecar's rules take effect from the turn in which it is read).

```
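
The thin core keeps one `{{INCLUDE:_common-contract.md}}` token while the sidecars deliberately avoid the directive. How okstra expands the token is not shown in this plan, so the following is only a sketch of what a single-pass expander could look like; `expand_includes` and `INCLUDE_TOKEN` are hypothetical names, and recursion, missing-file handling, and path sandboxing are left out.

```python
import re
import tempfile
from pathlib import Path

# Assumption: the token syntax is exactly {{INCLUDE:<file name>}}.
INCLUDE_TOKEN = re.compile(r"\{\{INCLUDE:([^}]+)\}\}")


def expand_includes(text: str, base_dir: Path) -> str:
    """Replace each {{INCLUDE:name}} token with the contents of base_dir/name (single pass)."""
    def _load(match):
        included = base_dir / match.group(1).strip()
        return included.read_text(encoding="utf-8").rstrip("\n")
    return INCLUDE_TOKEN.sub(_load, text)


# Usage against a throwaway directory holding a stand-in contract file.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "_common-contract.md").write_text("- shared rule\n", encoding="utf-8")
    expanded = expand_includes(
        "# Profile\n{{INCLUDE:_common-contract.md}}\n- local rule", base
    )
    print(expanded)
```

Because a lead that reads a sidecar directly sees the raw file, any `{{INCLUDE:...}}` token left in a sidecar would reach the model unexpanded, which is the failure mode `test_sidecars_do_not_use_include_directive` guards against.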

- [ ] **Step 4: Confirm the tests pass**

```bash
python3 -m pytest tests/test_implementation_profile_split.py -v
```

Expected: all tests PASS.

- [ ] **Step 5: Commit**

```bash
git add prompts/profiles/implementation.md tests/test_implementation_profile_split.py
git commit -m "refactor(profiles/implementation): reduce main profile to thin core + 3 lazy sidecar pointers"
```

### Task C3: Add the lead's lazy reading discipline rule (`agents/SKILL.md`)

**Files:**
- Modify: `agents/SKILL.md` (new box after the Phase 1 box)
- Test: `tests/test_lead_lazy_reading_discipline.py`

- [ ] **Step 1: Write the failing tests**

```python
# tests/test_lead_lazy_reading_discipline.py
import re
from pathlib import Path

REPO_ROOT = Path(__file__).resolve().parents[1]
LEAD = REPO_ROOT / "agents/SKILL.md"


def test_lead_skill_documents_implementation_lazy_pointers():
    content = LEAD.read_text(encoding="utf-8")
    assert "Implementation profile lazy reading discipline" in content


def test_lead_skill_names_three_sidecars_with_read_timing():
    content = LEAD.read_text(encoding="utf-8")
    box = content[content.index("Implementation profile lazy reading discipline"):]
    # Three sidecar file names, each annotated with its entry phase
    for sidecar, phase_marker in [
        ("_implementation-executor.md", "Phase 5"),
        ("_implementation-verifier.md", "Phase 5"),
        ("_implementation-deliverable.md", "Phase 6"),
    ]:
        assert sidecar in box[:2000], f"missing sidecar reference: {sidecar}"
        # The phase marker must appear near the sidecar line
        idx = box.index(sidecar)
        nearby = box[max(0, idx - 200):idx + 200]
        assert phase_marker in nearby, f"{sidecar} not annotated with {phase_marker}"


def test_lead_skill_blocks_phase_entry_without_sidecar():
    content = LEAD.read_text(encoding="utf-8")
    box = content[content.index("Implementation profile lazy reading discipline"):][:2500]
    assert re.search(r"BLOCKING", box)
    assert "phase 진입 거부" in box or "phase entry refused" in box.lower()
```

- [ ] **Step 2: Confirm the tests fail**

```bash
python3 -m pytest tests/test_lead_lazy_reading_discipline.py -v
```

Expected: 3 FAIL.

- [ ] **Step 3: Add a new box after the Phase 1 box in `agents/SKILL.md` (at the end of the "Lazy reading discipline" section)**

In `agents/SKILL.md`, insert the following on the blank line right after the end of the `**Lazy reading discipline (do NOT read at Phase 1):**` bullet list (L163-169):

```markdown
**Implementation profile lazy reading discipline (BLOCKING — applies only when `task_type == "implementation"`):**

The `implementation` profile's thin core (`prompts/profiles/implementation.md`) is intentionally minimal so the Phase 1 baseline stays small. Three sidecar files carry the bulk of the rules and MUST be read at the listed phase — do NOT pre-load them at Phase 1.

| Sidecar | Read at | Owned by |
|---------|---------|----------|
| `prompts/profiles/_implementation-executor.md` | **Phase 5**, after Stage Map parse, BEFORE issuing the Executor's first `Edit` / `Write` | Pre-implementation context exploration, TDD loop, Stage execution contract, allowed actions, commit-message format |
| `prompts/profiles/_implementation-verifier.md` | **Phase 5**, between Executor stage completion and the first verifier dispatch | Two-tier command lookup, deny-list, discrepancy rule, Read-only command log, verifier-specific forbidden actions |
| `prompts/profiles/_implementation-deliverable.md` | **Phase 6**, after Phase 5.5 convergence completes, BEFORE constructing the report-writer dispatch prompt | Required deliverable shape, Validation / TDD evidence rules, Verifier results structure, Self-review pass, Lead post-stage persistence |

**Entry guard (BLOCKING).** Before transitioning into Phase 5 or Phase 6 for an `implementation` run, lead MUST emit a single Read tool call for the sidecar(s) above whose `Read at` matches the entering phase. If lead enters the phase without that Read on record (verified via lead session jsonl by `tests/` and observable to a human auditor), the outcome is phase entry refused (phase 진입 거부): lead writes a `contract-violation` to the run-level errors log with `--message "implementation-sidecar-not-loaded"` and stops. Re-entry requires the sidecar Read first.

The guard is not satisfied by remembering content from a prior run — each implementation run reads the sidecar fresh, because the sidecars are part of the runtime shipped via `okstra install` and may have been updated between runs.

This pattern is implementation-only. Other profiles (`requirements-discovery`, `error-analysis`, `implementation-planning`, `final-verification`, `release-handoff`) load their whole profile body at Phase 1 as before — they are short enough not to benefit from a split.
```
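
The entry guard says the sidecar Read must be visible in the lead session jsonl, but this plan does not show the log schema. As a rough sketch only, the check could look like the following, assuming each line is a JSON object with hypothetical `tool` and `file_path` fields; real session logs will likely need a different accessor.

```python
import json
from typing import Iterable


def sidecar_read_on_record(jsonl_lines: Iterable[str], sidecar_name: str) -> bool:
    """Return True if any record looks like a Read tool call on the given sidecar.

    Assumed (hypothetical) record shape: {"tool": "Read", "file_path": "..."}.
    """
    for raw in jsonl_lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # tolerate partial or corrupt lines
        if not isinstance(record, dict):
            continue
        path = str(record.get("file_path", ""))
        if record.get("tool") == "Read" and path.endswith(sidecar_name):
            return True
    return False


session = [
    '{"tool": "Bash", "command": "ls"}',
    '{"tool": "Read", "file_path": "prompts/profiles/_implementation-verifier.md"}',
    "not json at all",
]
print(sidecar_read_on_record(session, "_implementation-verifier.md"))
```

A check like this only proves that a Read happened somewhere in the session; confirming that it happened before phase entry would additionally need the record order or timestamps.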

- [ ] **Step 4: Confirm the tests pass**

```bash
python3 -m pytest tests/test_lead_lazy_reading_discipline.py -v
```

Expected: 3 PASS.

- [ ] **Step 5: Commit**

```bash
git add agents/SKILL.md tests/test_lead_lazy_reading_discipline.py
git commit -m "feat(agents/SKILL): document implementation-profile lazy reading discipline with BLOCKING entry guard"
```

### Task C4: Update the validator so its substring checks still pass after the sidecar split

**Files:**
- Modify: `validators/validate-run.py` (if needed)
- Test: `tests/test_validate_run_implementation_substrings.py`

- [ ] **Step 1: Identify which substrings the validator currently checks in implementation reports**

```bash
grep -n "implementation\|TDD evidence\|Validation evidence\|Verifier results\|Out-of-plan edits\|Stage sidecar evidence" validators/validate-run.py
```

If the substring checks target the produced report (final-report.md) itself, the sidecar split has no effect, because the report must still follow the deliverable spec. If the checks target the profile body (for example, whether a specific token exists inside the profile), the validator needs updating.

- [ ] **Step 2: Write the failing test**

```python
# tests/test_validate_run_implementation_substrings.py
"""Checks that validate-run.py's implementation profile inspection still works after the sidecar split."""
import subprocess
import sys
from pathlib import Path

import pytest

REPO_ROOT = Path(__file__).resolve().parents[1]


def test_validator_accepts_split_profile_present():
    """When the thin core and all 3 sidecars exist, the validator must not emit an 'implementation profile incomplete' style error."""
    # Simple self-check: if the validator inspects profile files, that inspection must cover the sidecars too.
    # Adjust the concrete command to match how validate-run.py is actually invoked.
    r = subprocess.run(
        [sys.executable, str(REPO_ROOT / "validators/validate-run.py"), "--check-profile", "implementation"],
        capture_output=True, text=True,
    )
    # Skip when validate-run.py does not support the --check-profile flag
    if r.returncode == 2 and "unrecognized" in r.stderr.lower():
        pytest.skip("validate-run.py does not expose --check-profile; substring check is report-only")
    assert "implementation profile incomplete" not in r.stderr.lower()
    assert "sidecar" not in r.stderr.lower() or "missing" not in r.stderr.lower()
```

- [ ] **Step 3: Run the test; the outcome branches on the validator's current shape**

```bash
python3 -m pytest tests/test_validate_run_implementation_substrings.py -v
```

If PASS / SKIP, go to Step 5. If FAIL, continue with the next step.

- [ ] **Step 4: (only on FAIL) Update the validator**

If `validators/validate-run.py` looks for specific implementation-profile substrings directly in the profile body, add the three sidecar files to the search targets. Example code pattern:

```python
# Before:
profile_text = (REPO_ROOT / "prompts/profiles/implementation.md").read_text(encoding="utf-8")
for token in REQUIRED_TOKENS:
    assert token in profile_text, f"missing token: {token}"

# After:
profile_paths = [
    REPO_ROOT / "prompts/profiles/implementation.md",
    REPO_ROOT / "prompts/profiles/_implementation-executor.md",
    REPO_ROOT / "prompts/profiles/_implementation-verifier.md",
    REPO_ROOT / "prompts/profiles/_implementation-deliverable.md",
]
profile_text = "\n".join(p.read_text(encoding="utf-8") for p in profile_paths)
for token in REQUIRED_TOKENS:
    assert token in profile_text, f"missing token: {token}"
```

(Adjust the actual pattern to the grep results from Step 1.)
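
If the validator does end up scanning the profile files, it may help to separate "file is missing" from "token is missing" so the error message points at the right cause. A minimal sketch under that assumption follows; `load_profile_union` and `PROFILE_FILES` are hypothetical names, not existing validator code.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple

PROFILE_FILES = [
    "implementation.md",
    "_implementation-executor.md",
    "_implementation-verifier.md",
    "_implementation-deliverable.md",
]


def load_profile_union(profiles_dir: Path) -> Tuple[str, List[str]]:
    """Return (joined text of the files that exist, names of the files that are missing)."""
    present: List[str] = []
    missing: List[str] = []
    for name in PROFILE_FILES:
        (present if (profiles_dir / name).is_file() else missing).append(name)
    text = "\n".join(
        (profiles_dir / name).read_text(encoding="utf-8") for name in present
    )
    return text, missing


# Usage against a throwaway directory that holds only two of the four files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "implementation.md").write_text("Pre-implementation gate\n", encoding="utf-8")
    (root / "_implementation-executor.md").write_text("Mandatory TDD loop\n", encoding="utf-8")
    union_text, missing_files = load_profile_union(root)
    print(missing_files)
```

Reporting `missing_files` first keeps a partially installed runtime from surfacing as a long list of unrelated "token dropped" failures.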

- [ ] **Step 5: Regression + e2e**

```bash
python3 -m pytest tests/ -x -q
bash validators/validate-workflow.sh
```

- [ ] **Step 6: Commit**

```bash
git add validators/validate-run.py tests/test_validate_run_implementation_substrings.py
git commit -m "fix(validators/validate-run): include implementation sidecars in substring scan"
```

(If there were no code changes, commit only the test:)

```bash
git add tests/test_validate_run_implementation_substrings.py
git commit -m "test(validators): pin implementation sidecar discovery"
```

### Task C5: install pipeline + CHANGES.md + CLAUDE.md

**Files:**
- Modify: `src/install.mjs` (verify that the sidecar files are synced to the user machine)
- Modify: `CLAUDE.md` "Where to find things"
- Modify: `CHANGES.md`

- [ ] **Step 1: Verify the install pipeline**

```bash
node bin/okstra --version
# install dry-run or sandbox install
OKSTRA_HOME=$(mktemp -d) node bin/okstra install --dry-run 2>&1 | grep -E "(executor|verifier|deliverable|fragments)"
```

Confirm that `_implementation-*.md` and `prompts/fragments/*.md` are synced to `~/.okstra/lib/python/prompts/` (or the equivalent location) on the user machine.

If the sync is already a directory-level rsync, the files are included automatically. If it uses a per-file allowlist, they must be added.
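
Whether the sync is directory-level or allowlist-based, the outcome can be checked the same way: compare the source prompt tree against the installed tree and list what is absent. The sketch below assumes nothing about okstra's installer; `unsynced_files` is a hypothetical helper and the directory names are placeholders.

```python
import tempfile
from pathlib import Path
from typing import List


def unsynced_files(source_dir: Path, installed_dir: Path, pattern: str = "*.md") -> List[str]:
    """List files under source_dir (as relative POSIX paths) with no counterpart under installed_dir."""
    missing = []
    for src in sorted(source_dir.rglob(pattern)):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        if not (installed_dir / rel).is_file():
            missing.append(rel.as_posix())
    return missing


# Usage: simulate an install that copied the main profile but not a sidecar.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "prompts"
    installed = Path(tmp) / "installed"
    (source / "profiles").mkdir(parents=True)
    (installed / "profiles").mkdir(parents=True)
    for name in ("implementation.md", "_implementation-executor.md"):
        (source / "profiles" / name).write_text("x", encoding="utf-8")
    (installed / "profiles" / "implementation.md").write_text("x", encoding="utf-8")
    gaps = unsynced_files(source, installed)
    print(gaps)
```

An empty result would suggest the sidecars travel with the rest of `prompts/`; a non-empty one points at an allowlist that needs the new files.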

- [ ] **Step 2: If anything is missing, update `src/install.mjs` (or `tools/build.mjs`)**

(Omitted: conditional on the Step 1 result)

- [ ] **Step 3: Add entries to the `CLAUDE.md` "Where to find things" table**

Add two rows to the table in `CLAUDE.md`:

```markdown
| `[Required reading]` / `[Error reporting]` clause SSOT body | [`prompts/fragments/required-reading-block.md`](prompts/fragments/required-reading-block.md), [`prompts/fragments/error-reporting-block.md`](prompts/fragments/error-reporting-block.md) — the body inlined into dispatch prompts |
| Implementation profile lazy sidecars | [`prompts/profiles/_implementation-executor.md`](prompts/profiles/_implementation-executor.md), [`_implementation-verifier.md`](prompts/profiles/_implementation-verifier.md), [`_implementation-deliverable.md`](prompts/profiles/_implementation-deliverable.md) — lazily read by the lead at Phase 5 / 6 |
```

- [ ] **Step 4: Add a user-impact entry to `CHANGES.md`**

```markdown
### refactor(profiles/implementation): split the 161-line profile into a thin core (~50 lines) + 3 lazy sidecars

- **Background**: `implementation.md` had grown to 161 lines and accounted for a large share of the lead's Phase 1 baseline cost. Not every rule is needed in every phase (for example, Verifier QA duties are needed only right before the Phase 5 verifier dispatch), so lazy loading was possible.
- **Changes**:
  - `prompts/profiles/_implementation-executor.md` (new): Pre-implementation context exploration, TDD loop, Stage execution contract, allowed actions, commit-message format. Lazy-loaded on entering Phase 5.
  - `prompts/profiles/_implementation-verifier.md` (new): Two-tier command lookup, deny-list, discrepancy rule, Read-only command log, verifier-specific forbidden actions. Lazy-loaded right before the Phase 5 verifier dispatch.
  - `prompts/profiles/_implementation-deliverable.md` (new): Required deliverable shape, Validation/TDD evidence rules, Verifier results structure, Self-review pass, Lead post-stage persistence. Lazy-loaded on entering Phase 6.
  - `prompts/profiles/implementation.md` body reduced to a thin core (~50 lines): Purpose, Required workers, Executor binding, common-contract INCLUDE, Pre-implementation gate, Task worktree, universal Forbidden actions, In-phase debugging, Lazy section pointers.
  - `agents/SKILL.md` gains an "Implementation profile lazy reading discipline" BLOCKING box: if the Read of the relevant sidecar is not recorded in the lead session jsonl on entering Phase 5/6, phase entry is refused with a `contract-violation`.
- **User impact**: After the next release, a single `npx -y okstra@latest install` propagates the change. The lead's Phase 1 baseline for the implementation phase shrinks from 161 lines (profile) to ~50 lines (thin core), reducing the per-turn cache_read cost. **Compatibility**: all rules are preserved (only moved to other files). `okstra install` syncs the three sidecar files to the user machine. Custom variants of the implementation profile keep working as-is (the thin core is merely larger; no rules are dropped).
```

- [ ] **Step 5: Regression + commit**

```bash
python3 -m pytest tests/ -x -q
bash validators/validate-workflow.sh
git add CLAUDE.md CHANGES.md src/install.mjs tools/build.mjs
git commit -m "docs(claude-md,changes): document implementation profile lazy split"
```

(If install/build did not change, leave those two files out.)

---

## Final integration
|
|
1601
|
+
|
|
1602
|
+
### Task FINAL: 전체 e2e + token-usage smoke + dogfooding
|
|
1603
|
+
|
|
1604
|
+
- [ ] **Step 1: 풀 e2e 시나리오 모두**
|
|
1605
|
+
|
|
1606
|
+
```bash
|
|
1607
|
+
for s in tests-e2e/scenario-*.sh; do
|
|
1608
|
+
echo "=== $s ==="
|
|
1609
|
+
bash "$s" || { echo "FAIL: $s"; exit 1; }
|
|
1610
|
+
done
|
|
1611
|
+
```
|
|
1612
|
+
|
|
1613
|
+
- [ ] **Step 2: `npm run build` 후 사용자 install 시뮬레이션**
|
|
1614
|
+
|
|
1615
|
+
```bash
|
|
1616
|
+
npm run build
|
|
1617
|
+
OKSTRA_HOME=$(mktemp -d) node bin/okstra install
|
|
1618
|
+
node bin/okstra doctor
|
|
1619
|
+
```
|
|
1620
|
+
|
|
1621
|
+
`doctor` 의 출력에서 fragment 파일 + 3 sidecar 가 모두 보고되어야 함. 보이지 않으면 install/build 누락.

- [ ] **Step 3: Dogfooding: run it once on a small implementation task**

```bash
# Pick one real, small task and run okstra --task-type implementation
# (This plan's scope ends when the change lands. Measurement is a follow-up.)
```

Collect token-usage results as a comparison baseline (one run before applying B/D/C and one run after) to measure the plan's effect. The collected numbers serve as evidence for a follow-up issue or the next release note.
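
For the before/after comparison, a relative-reduction figure is usually easier to report than raw counts. The sketch below uses the only numbers this plan states, the Phase 1 baseline going from 161 lines to roughly 50 lines, purely as sample input; actual token-usage values have to come from the dogfooding runs, and `relative_reduction` is a hypothetical helper name.

```python
def relative_reduction(before, after):
    """Return the fractional reduction from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


if __name__ == "__main__":
    # Line counts from the plan: 161-line profile vs. ~50-line thin core.
    # The ~50 figure is approximate, so the result is approximate too.
    print(f"{relative_reduction(161, 50):.1%}")
```

The same function applies unchanged to token counts once both runs have been collected.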

- [ ] **Step 4: Create the PR**

```bash
gh pr create --base main --title "Implementation lead context slimming: B(fragment-pointers) + D(stage-step-cap-3) + C(profile-lazy-split)" --body "$(cat <<'EOF'
## Summary
- Track B: move the `[Required reading]` / `[Error reporting]` clause bodies into `prompts/fragments/` as the SSOT; the lead's dispatch prompt emits only a one-line pointer. The codex/gemini wrapper subagent resolves the fragment right before dispatch and inlines it into the CLI stdin.
- Track D: tighten the recommended stage step count in the implementation-planning profile to ≤ 3 (hard cap 6 kept). The validator only emits a stderr warning when step > 3 (no effect on the exit code).
- Track C: split the 161-line `prompts/profiles/implementation.md` into a thin core (~50 lines) + 3 lazy sidecars (executor / verifier / deliverable). The lead Reads only the relevant sidecar on entering Phase 5 / 6; skipping the Read is BLOCKING.

## Why
In the v0.36 implementation phase, lead context bloat means phase 6 (worker dispatch) often does not finish within a single lead session, and right before dispatch the wizard frequently asks to choose between "continue in this session" and "persist the prompt and resume in the next session". The three changes above cut the lead's per-dispatch, per-phase, and Phase 1 baseline context costs.

## Test plan
- [ ] `python3 -m pytest tests/ -x -q`: 0 regressions
- [ ] `bash validators/validate-workflow.sh` PASS
- [ ] Every `tests-e2e/scenario-*.sh` scenario PASS
- [ ] `node bin/okstra doctor` reports all new fragment + sidecar files
- [ ] Dogfood one small implementation task and compare token-usage (numbers collected in a separate follow-up)

## Notes
- Track C lands only after the "Decision gate before C" condition in `docs/superpowers/plans/2026-05-24-implementation-lead-context-slimming.md` is met. If B/D alone make single-session phase 6 possible, C is held back as over-engineering.
EOF
)"
```

(This step requires user approval, because `gh pr create` has external effects.)
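
The Track B behavior summarized in the PR body (the dispatch prompt carries a one-line pointer, and the wrapper inlines the fragment text just before writing the CLI stdin) can be sketched as below. The pointer syntax `@fragment:` and the function names are hypothetical, chosen only for illustration. `_FRAGMENT_MISSING` is the sentinel name used in this plan, but returning it for an absent file is an assumption about how the workers use it.

```python
from pathlib import Path

# Sentinel name taken from the plan; its exact use by the workers is assumed.
FRAGMENT_MISSING = "_FRAGMENT_MISSING"

# Hypothetical pointer syntax; the real one-line pointer format may differ.
POINTER_PREFIX = "@fragment:"


def resolve_fragment(relative_path, root):
    """Return the fragment text, or the sentinel when the file is absent."""
    target = Path(root) / relative_path
    if not target.is_file():
        return FRAGMENT_MISSING
    return target.read_text(encoding="utf-8")


def build_stdin(prompt_lines, root):
    """Replace each pointer line with the fragment body it names."""
    resolved = []
    for line in prompt_lines:
        if line.startswith(POINTER_PREFIX):
            relative_path = line[len(POINTER_PREFIX):].strip()
            resolved.append(resolve_fragment(relative_path, root))
        else:
            resolved.append(line)
    return "\n".join(resolved)
```

Keeping the resolution in the wrapper means the lead's context only ever holds the pointer line, which is the saving Track B is after.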

---

## Self-review checklist

Items the author (= Claude lead) must personally pass right after writing this plan:

1. **Spec coverage**
   - Is every item of the user's analysis, B through C, mapped to a plan task?
     - B: Task B1–B7 ✓
     - D: Task D1–D3 ✓
     - C: Task C1–C5 ✓
   - Is the "Decision gate before C" stated explicitly in the plan? ✓

2. **Placeholder scan**
   - Does "TBD" / "implement later" / "TODO" / "fill in details" appear in the plan body? None ✓
   - Is there any "Similar to Task N" pattern? The gemini-worker change in Task B5 is the same as the codex-worker one, but the differing tokens are enumerated explicitly ✓

3. **Type consistency**
   - Are file paths used identically across Track B / Track C? `prompts/fragments/required-reading-block.md` and `prompts/profiles/_implementation-*.md` are all consistent ✓
   - The `RECOMMENDED_STEP_CAP` / `HARD_STEP_CAP` constant names match between the validator code and the tests ✓
   - The `_FRAGMENT_MISSING` sentinel name is consistent across workers ✓

4. **Test-first compliance**
   - Every code-changing task follows the order (write test → confirm fail → implement → confirm pass → commit) ✓

5. **Cross-track dependencies**
   - The application order of B/D/C is stated (B → D → measure → decide on C).
   - No conflicts when applied together: the tracks touch different files/rule areas.
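
The two constants named in the type-consistency item map onto a small validator rule: warn above the recommended cap, fail only above the hard cap. The sketch below assumes the values 3 and 6 stated for Track D; the function name `check_stage_steps`, the message wording, and treating a hard-cap breach as a failure are illustrative assumptions, not the shipped validator.

```python
import sys

RECOMMENDED_STEP_CAP = 3  # above this: warning only, exit code unaffected
HARD_STEP_CAP = 6         # above this: assumed to be a validation failure


def check_stage_steps(stage_name, step_count, stream=None):
    """Return True if the stage passes; write any diagnostics to `stream`."""
    if stream is None:
        stream = sys.stderr
    if step_count > HARD_STEP_CAP:
        print(
            f"error: stage {stage_name} has {step_count} steps "
            f"(hard cap {HARD_STEP_CAP})",
            file=stream,
        )
        return False
    if step_count > RECOMMENDED_STEP_CAP:
        print(
            f"warning: stage {stage_name} has {step_count} steps "
            f"(recommended <= {RECOMMENDED_STEP_CAP})",
            file=stream,
        )
    return True
```

Because the warning path still returns `True`, a caller that derives its exit code from the return value keeps exiting successfully for stages with 4 to 6 steps.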

---

## Execution Handoff

Plan saved: `docs/superpowers/plans/2026-05-24-implementation-lead-context-slimming.md`.

There are two execution options.

**1. Subagent-Driven (recommended)**: a fresh subagent per task, with the lead reviewing between tasks. Fast iteration. Each track (B/D/C) can be dispatched independently.

**2. Inline Execution**: batch processing within the same session via executing-plans. Review happens at each checkpoint.

Let me know which way to proceed. (Note: C is entered only after passing the "Decision gate before C", so a split run is also possible: dispatch only B and D first, then trigger C separately after measurement.)
|