okstra 0.64.1 → 0.65.0

Files changed (42)
  1. package/bin/okstra +1 -0
  2. package/docs/kr/architecture.md +2 -0
  3. package/docs/kr/cli.md +11 -3
  4. package/docs/kr/performance-improvement-plan-v2.md +2 -1
  5. package/docs/project-structure-overview.md +1 -0
  6. package/docs/superpowers/plans/2026-06-10-p6-token-usage-incremental.md +1029 -0
  7. package/docs/superpowers/specs/2026-06-10-blocking-contract-posthoc-conformance-design.md +168 -0
  8. package/package.json +1 -1
  9. package/runtime/BUILD.json +2 -2
  10. package/runtime/agents/SKILL.md +3 -1
  11. package/runtime/agents/workers/claude-worker.md +1 -1
  12. package/runtime/agents/workers/codex-worker.md +1 -0
  13. package/runtime/agents/workers/gemini-worker.md +1 -0
  14. package/runtime/bin/lib/okstra/cli.sh +4 -0
  15. package/runtime/bin/lib/okstra/globals.sh +1 -0
  16. package/runtime/bin/lib/okstra/usage.sh +4 -1
  17. package/runtime/bin/okstra.sh +1 -0
  18. package/runtime/prompts/profiles/_implementation-executor.md +1 -0
  19. package/runtime/python/okstra_ctl/clarification_items.py +96 -37
  20. package/runtime/python/okstra_ctl/context_cost.py +86 -8
  21. package/runtime/python/okstra_ctl/locks.py +32 -0
  22. package/runtime/python/okstra_ctl/migrate.py +45 -6
  23. package/runtime/python/okstra_ctl/pr_template.py +2 -7
  24. package/runtime/python/okstra_ctl/run.py +58 -44
  25. package/runtime/python/okstra_ctl/run_context.py +3 -8
  26. package/runtime/python/okstra_ctl/seeding.py +25 -18
  27. package/runtime/python/okstra_ctl/wizard.py +8 -10
  28. package/runtime/python/okstra_ctl/worktree.py +13 -0
  29. package/runtime/python/okstra_project/dirs.py +10 -1
  30. package/runtime/python/okstra_token_usage/claude.py +226 -61
  31. package/runtime/python/okstra_token_usage/cli.py +10 -1
  32. package/runtime/python/okstra_token_usage/collect.py +34 -27
  33. package/runtime/python/okstra_token_usage/cursor.py +93 -0
  34. package/runtime/python/okstra_token_usage/paths.py +29 -2
  35. package/runtime/skills/okstra-coding-preflight/clean-code.md +15 -0
  36. package/runtime/skills/okstra-inspect/SKILL.md +16 -11
  37. package/runtime/skills/okstra-run/templates/pr-body.template.md +13 -16
  38. package/runtime/validators/lib/fixtures.sh +73 -10
  39. package/runtime/validators/lib/runners.sh +4 -0
  40. package/runtime/validators/validate-run.py +53 -0
  41. package/runtime/validators/validate_session_conformance.py +430 -0
  42. package/src/migrate.mjs +31 -0
package/docs/superpowers/specs/2026-06-10-blocking-contract-posthoc-conformance-design.md ADDED
@@ -0,0 +1,168 @@
1
+ # Design: post-hoc conformance checks for the three BLOCKING contracts
2
+
3
+ - Written: 2026-06-10
4
+ - Status: implemented (approved 2026-06-10; implemented as `validators/validate_session_conformance.py`, including verification against a real run)
5
+ - Related global rule: CLAUDE.md "Distinguish declaration from enforcement" (rule #3): every MUST/BLOCKING written in a document needs a one-line note saying where and how it is verified.
6
+
7
+ ## 1. Background and motivation
8
+
9
+ `agents/SKILL.md` declares the three contracts below as BLOCKING, but currently no code path (validator, test, or runtime guard) turns a violation into a failure. They are declared but not enforced.
10
+
11
+ | # | Contract | Declared at | Current enforcement |
12
+ |---|------|----------|----------|
13
+ | 1 | The lead's 12 `PROGRESS: <phase-id>` checkpoint lines | [agents/SKILL.md:83-105](../../../agents/SKILL.md:83) | None: `grep -rln "PROGRESS:" validators/ tests/ scripts/okstra_ctl/` → 0 hits (checked 2026-06-10) |
14
+ | 2 | claude-worker's 5-minute heartbeat on the audit sidecar (appending `- PROGRESS: <stage> <ISO-UTC>`) | [agents/SKILL.md:384](../../../agents/SKILL.md:384), [agents/workers/claude-worker.md:65](../../../agents/workers/claude-worker.md:65) | Only the sidecar's **existence** is checked ([validators/validate-run.py:974-981](../../../validators/validate-run.py:974)). Cadence is not checked. The existing heartbeat tests ([tests/test_okstra_wrapper_status.py](../../../tests/test_okstra_wrapper_status.py)) cover only the `.status.json` of the codex/gemini wrappers |
15
+ | 3 | Implementation sidecar entry guard: the obligation to Read `_implementation-executor/-verifier/-deliverable.md` before entering Phase 5/6 | [agents/SKILL.md:176-188](../../../agents/SKILL.md:176) | None: the contract only declares that the Read "can be confirmed in the lead session jsonl", and nothing ever inspects the jsonl |
16
+
17
+ All three contracts **leave after-the-fact evidence in the lead session jsonl or in the run artifacts (files on disk)**. The most natural enforcement point is therefore a set of post-hoc checks added to `validators/validate-run.py`, which the lead always runs in Phase 7: a violation falls through the existing mechanism unchanged, `validation failed → run status = contract-violated` ([validators/validate-run.py:256-258](../../../validators/validate-run.py:256)).
18
+
19
+ ## 2. Scope / non-scope
20
+
21
+ **Scope**
22
+ - Three post-hoc checks that can be verified at the time `validate-run.py` runs (Phase 7).
23
+ - The check logic lives in one new module, called from `main()` in `validate-run.py`.
24
+ - Unit tests (synthetic jsonl / sidecar fixtures).
25
+
26
+ **Non-scope**
27
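+ - Runtime (real-time) enforcement: for example, watching for a 5-minute stale mtime is the lead's polling job while the run is in progress and cannot be replaced by a post-hoc check. This design prevents a run that violated a contract from receiving a passing verdict.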
28
+ - Defense against malicious forgery: heartbeat timestamps are self-reported by the worker. The post-hoc checks catch **omissions**, not forgeries.
29
+ - Checking the `PROGRESS: complete` / `phase-7-teardown` lines: they are emitted **after** the validator runs, so they are structurally uncheckable (see 4.1 below).
30
+
31
+ ## 3. Design overview
32
+
33
+ ### 3.1 Module placement
34
+
35
+ New module: **`validators/validate_session_conformance.py`**
36
+
37
+ - Same pattern as the existing phase-specific validators: like [validators/validate_fanout.py](../../../validators/validate_fanout.py) and [validators/validate_improvement_report.py](../../../validators/validate_improvement_report.py), `validate-run.py` imports it and folds the results into the `failures` list (prefix `session-conformance: `).
38
+ - The existing `scripts/okstra_ctl/conformance.py` implements **stage QA conformance** (a separate concept, [docs/superpowers/specs/2026-06-07-stage-conformance-qa-design.md](2026-06-07-stage-conformance-qa-design.md)), so the new module is named `session_conformance` to avoid a name clash.
39
+
40
+ ### 3.2 Shared infrastructure: reuse a single reference point
41
+
42
+ Locating and parsing the lead session jsonl is already implemented in the `okstra_token_usage` package. It is used as-is rather than reimplemented.
43
+
44
+ | Function | Reuse point |
45
+ |------|------------|
46
+ | Per-project jsonl directory (`~/.claude/projects/<encoded-cwd>/`) | [scripts/okstra_token_usage/paths.py:19](../../../scripts/okstra_token_usage/paths.py:19) `claude_project_dir` |
47
+ | Lead sessionId → jsonl mapping (plus a teamName tag-scan fallback) | [scripts/okstra_token_usage/claude.py:100](../../../scripts/okstra_token_usage/claude.py:100) `find_claude_team_sessions` |
48
+ | jsonl record iterator | `scripts/okstra_token_usage/jsonl_io.py` `iter_jsonl` |
49
+ | Run time window (scoping only this run out of the in-session lead's whole-session jsonl) | [scripts/okstra_token_usage/collect.py:121-137](../../../scripts/okstra_token_usage/collect.py:121) `_resolve_run_window`: **promoted to the public `resolve_run_window`** and shared by `collect()` and the new module (pre-1.0, so renamed without a compatibility shim) |
50
+ | Source of the lead sessionId | team-state `lead.sessionId` (written at [scripts/okstra_ctl/render.py:425](../../../scripts/okstra_ctl/render.py:425), consumed at [scripts/okstra_token_usage/collect.py:165](../../../scripts/okstra_token_usage/collect.py:165)) |
51
+
52
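+ Why run-window scoping is mandatory: an in-session (`okstra-run` skill) lead is recorded in the jsonl for the user's whole session, so scanning without a window would **mistake PROGRESS lines left by an earlier okstra run in the same session for evidence of this run** (a false pass). Token accounting already uses the window for the same reason (see the comment at [collect.py:124-130](../../../scripts/okstra_token_usage/collect.py:124)).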
53
+
54
+ ### 3.3 jsonl scan rules (shared by checks 1 and 3)
55
+
56
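+ - Records scanned: only records with `type == "assistant"`. When a Skill is invoked, the SKILL.md body (including the example checkpoint lines!) is injected into the transcript as a tool_result (a user record), so looking at anything other than assistant records produces an immediate false pass.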
57
+ - PROGRESS lines: extracted from the `message.content[].type == "text"` blocks of assistant records with the line-anchored regex `^PROGRESS: <phase-id>\b` (MULTILINE). Thinking blocks are excluded.
58
+ - Read tool calls: in assistant records, `message.content[].type == "tool_use"`, `name == "Read"`, matched on the basename of `input.file_path`.
59
+ - Each piece of evidence carries the record's `timestamp`, and only evidence inside the run window (`since ≤ ts ≤ until`) counts.
60
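+ - Files scanned: from the result of `find_claude_team_sessions(cwd, team_name, lead_sid)`, the **lead candidate set** = {the jsonl for the recorded `lead.sessionId`} ∪ {jsonl files that carry the team tag but no `agentName`}. The latter absorbs the case where the lead session was forked by `claude --resume` (new sessionId, no `agentName`); worker sessions have an `agentName` and are therefore excluded naturally.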
61
+
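The scan rules in §3.3 can be sketched roughly as follows. This is a minimal illustration, not the shipped `validate_session_conformance.py`: the function names `parse_ts` and `collect_progress` are hypothetical, the regex is a generic variant that captures the first token after `PROGRESS: `, and the record shape (`type`, `timestamp`, `message.content[]`) is the assumption the spec itself asks to verify in P0.

```python
import json
import re
from datetime import datetime, timezone

# Generic variant of the line-anchored pattern: captures the phase-id token.
PROGRESS_RE = re.compile(r"^PROGRESS: (\S+)", re.MULTILINE)


def parse_ts(value):
    """Parse an ISO-8601 timestamp, accepting a trailing 'Z' for UTC."""
    ts = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)  # assumption: naive values are UTC
    return ts


def collect_progress(jsonl_lines, since, until):
    """Return (phase_id, timestamp) pairs from assistant text blocks in the window."""
    evidence = []
    for raw in jsonl_lines:
        try:
            record = json.loads(raw)
            ts = parse_ts(record["timestamp"])
            in_window = since <= ts <= until
        except (ValueError, KeyError, TypeError, AttributeError):
            continue  # unrecognized record: skip it rather than guess
        if record.get("type") != "assistant" or not in_window:
            continue  # user/tool_result records may echo SKILL.md examples
        content = (record.get("message") or {}).get("content")
        if not isinstance(content, list):
            continue
        for block in content:
            # Only "text" blocks count; thinking and tool_use blocks are ignored.
            if isinstance(block, dict) and block.get("type") == "text":
                text = block.get("text")
                if isinstance(text, str):
                    for match in PROGRESS_RE.finditer(text):
                        evidence.append((match.group(1), ts))
    return evidence
```

Skipping unparseable records while treating zero evidence as a failure upstream matches the conservative behavior described in §7; `since` and `until` are expected to be timezone-aware datetimes.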
62
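+ **P0 verification item (first implementation step):** confirm the record-shape assumptions above (assistant text blocks / `tool_use.name=="Read"` / `input.file_path`) against the lead jsonl of one real okstra run, then finalize the parser. If the assumptions are wrong, return to the design.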
63
+
64
+ ## 4. Check details
65
+
66
+ ### 4.1 Check 1: PROGRESS checkpoint lines
67
+
68
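+ Of the 12 checkpoints in [agents/SKILL.md:87-101](../../../agents/SKILL.md:87), only **those that must already have been emitted when the validator runs** are required. The required set is built dynamically from the shape of the run (roster, artifacts):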
69
+
70
+ | Checkpoint | Required when | Verdict |
71
+ |-----------|----------|------|
72
+ | `phase-1-intake reading…` / `phase-1-intake complete` | Always | ≥1 each |
73
+ | `phase-2-prompts…` | Always | ≥1 |
74
+ | `phase-3-team-create…` | At least one worker was dispatched (team-state worker status ∈ {completed, timeout, error, in-progress}; the same criterion as `any_dispatched` at [validators/validate-run.py:373-377](../../../validators/validate-run.py:373)) | ≥1 |
75
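+ | `phase-4-dispatch worker=<role>…` | For every worker whose dispatch was attempted (status ∈ ATTEMPTED_STATUSES) | ≥1 per role. Roles are matched by comparing the `worker=` token after normalization (lowercasing, treating whitespace and hyphens as equivalent); if the normalized match fails, that worker is reported as a failure item |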
76
+ | `phase-5-poll…` | Not checked (legitimately zero in a short run where no pending set was ever observed) | n/a |
77
+ | `phase-5-collect worker=<role>…` | For every worker with status == completed | ≥1 per role |
78
+ | `phase-5.5-convergence round=…` | A convergence state artifact exists in the run directory | ≥1 |
79
+ | `phase-5.6-critic…` | Not checked (opt-in; see [agents/SKILL.md:97](../../../agents/SKILL.md:97)) | n/a |
80
+ | `phase-6-synthesis…` | `Report writer worker` is in the required roster | ≥1 |
81
+ | `phase-7-persist…` | Always (the validator is invoked inside Phase 7, so the start line has already been emitted) | ≥1 |
82
+ | `phase-7-teardown…` / `complete…` | Not checked (emitted **after** the validator runs) | n/a |
83
+
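The per-role matching for `phase-4-dispatch` might look like the sketch below. It assumes the role appears as a single whitespace-free `worker=<role>` token on the checkpoint line; `normalize_role` and `missing_dispatch_roles` are hypothetical names used only for illustration, and the `model=example` field in the sample is made up.

```python
import re

# Matches a phase-4-dispatch checkpoint line and captures the worker= token.
DISPATCH_RE = re.compile(
    r"^PROGRESS: phase-4-dispatch\b.*?\bworker=(\S+)", re.MULTILINE
)


def normalize_role(role):
    """Lowercase and collapse runs of whitespace or hyphens into one hyphen."""
    return re.sub(r"[\s-]+", "-", role.strip().lower())


def missing_dispatch_roles(attempted_roles, transcript_text):
    """Return the attempted roles that have no matching dispatch checkpoint."""
    seen = {
        normalize_role(match.group(1))
        for match in DISPATCH_RE.finditer(transcript_text)
    }
    return [role for role in attempted_roles if normalize_role(role) not in seen]


sample = (
    "PROGRESS: phase-4-dispatch worker=claude-worker model=example\n"
    "PROGRESS: phase-4-dispatch worker=Codex-Worker\n"
)
# "Claude Worker" and "codex-worker" normalize to tokens seen in the sample,
# so only the third role is reported.
print(missing_dispatch_roles(["Claude Worker", "codex-worker", "Gemini worker"], sample))
```

Normalizing both sides before comparison keeps the check tolerant of casing and separator differences while still reporting a worker whose checkpoint is truly absent.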
84
+ The failure message names the missing checkpoint id and the corresponding row of SKILL.md.
85
+
86
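+ **When the lead jsonl is not found:** treated as a failure. Rationale: the token-usage contract already enforces the same principle. If the lead jsonl cannot be found, validation fails with `accuracy-failed` ([validators/validate-run.py:1819-1824](../../../validators/validate-run.py:1819)), so a run where "conformance cannot be inspected because there is no jsonl" could not pass anyway. No separate opt-out is provided (allowlist principle: only the passing conditions are defined).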
87
+
88
+ ### 4.2 Check 2: claude-worker heartbeat cadence
89
+
90
+ Target: `worker-results/claude-worker-audit-<task-type>-<seq>.md` (no jsonl needed; the file on disk is parsed). The sidecar's **existence** is already enforced by an existing check ([validators/validate-run.py:974-981](../../../validators/validate-run.py:974)), so this check adds only the **content cadence**. It applies only to runs in which claude-worker is ATTEMPTED: the 5-minute heartbeat contract belongs to [claude-worker.md:65](../../../agents/workers/claude-worker.md:65), while codex/gemini are already enforced on a separate path by the `.status.json` watchdog.
91
+
92
+ Parsing: extract the list of `- PROGRESS: <stage> <ISO-8601-UTC>` lines. Checks:
93
+
94
+ 1. **Start marker**: the stage of the first PROGRESS line must be `started` ("write the sidecar … with one `- PROGRESS: started <ISO>` line").
95
+ 2. **Pre-exit marker**: a line with stage `write-result-start` must exist (a result file without this marker violates the per-stage append contract).
96
+ 3. **Timestamps parseable and monotonically non-decreasing**: each unparseable line and each regressing timestamp is its own failure item.
97
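+ 4. **Cadence**: the gap between consecutive PROGRESS lines must be ≤ **5 minutes + 60 seconds of grace**. The contract says 5 minutes; the fixed grace absorbs the delay between the time the worker measures just before appending and the actual write.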
98
+
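The four checks above could be combined into one pass over the sidecar text, as in this sketch. It is an illustration under stated assumptions, not the validator's actual code: `check_heartbeat` and its error strings are hypothetical, and a timestamp without an offset is assumed to be UTC.

```python
from datetime import datetime, timedelta, timezone

PREFIX = "- PROGRESS:"
MAX_GAP = timedelta(minutes=5, seconds=60)  # contract cadence plus fixed grace


def check_heartbeat(sidecar_text, result_exists):
    """Return a list of failure strings for one audit sidecar."""
    errors, entries = [], []
    for line in sidecar_text.splitlines():
        if not line.startswith(PREFIX):
            continue  # heading and free-form notes are ignored
        parts = line[len(PREFIX):].split()
        try:
            stage, raw = parts  # ValueError unless exactly two tokens
            ts = datetime.fromisoformat(raw.replace("Z", "+00:00"))
        except ValueError:
            errors.append(f"unparseable PROGRESS line: {line!r}")
            continue
        if ts.tzinfo is None:
            ts = ts.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
        entries.append((stage, ts))

    # Check 1: the first heartbeat must be the start marker.
    if not entries or entries[0][0] != "started":
        errors.append("first PROGRESS stage is not 'started'")
    # Check 2: a result file implies the pre-exit marker was appended.
    if result_exists and all(stage != "write-result-start" for stage, _ in entries):
        errors.append("missing 'write-result-start' marker")
    # Checks 3 and 4: monotonic timestamps and bounded gaps between neighbors.
    for (_, prev), (stage, cur) in zip(entries, entries[1:]):
        if cur < prev:
            errors.append(f"timestamp regresses at stage {stage}")
        elif cur - prev > MAX_GAP:
            errors.append(f"gap before stage {stage} exceeds 5 minutes + 60s grace")
    return errors
```

Because the comparison is between consecutive lines only, a hang after the final line is invisible to this function, which is the limitation the spec states next.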
99
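+ **Limitation (stated explicitly):** a hang **after** the last PROGRESS line cannot be caught post-hoc (there is no end-time anchor; the file mtime is vulnerable to git and tampering, so it is not used as one). That interval belongs to the runtime domain, covered by the lead's 5-minute stale-mtime watch during the run ([agents/SKILL.md:384](../../../agents/SKILL.md:384)).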
100
+
101
+ ### 4.3 Check 3: implementation sidecar entry guard
102
+
103
+ Applies only to runs with `task_type == "implementation"`. In the lead jsonl (within the run window), the following is required, matching on the basename of `input.file_path` in `Read` tool_use blocks:
104
+
105
+ 1. **Existence**: ≥1 Read for each of `_implementation-executor.md`, `_implementation-verifier.md`, and `_implementation-deliverable.md`.
106
+ 2. **Order** (only when an anchor exists): the PROGRESS lines collected by check 1 serve as anchors:
107
+ - timestamp of the executor and verifier sidecar Reads < timestamp of the first `PROGRESS: phase-6-synthesis` line ([agents/SKILL.md:182-183](../../../agents/SKILL.md:182): both are "Read at Phase 5").
108
+ - timestamp of the deliverable sidecar Read < timestamp of the first `PROGRESS: phase-7-persist` line ([agents/SKILL.md:184](../../../agents/SKILL.md:184): "Read at Phase 6").
109
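+ - if the anchor line itself is missing (check 1 already reports that omission as a failure), the order check is skipped and only existence is checked, so the same cause does not produce two failures.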
110
+ 3. **Fresh-read rule** ([agents/SKILL.md:188](../../../agents/SKILL.md:188) "cannot be satisfied from memory of a prior run"): run-window scoping itself guarantees this, because only Reads inside this run's window count.
111
+
112
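+ Why basename matching: the sidecar's absolute path differs by layer (repo / `~/.okstra/lib` / installed copy), but the filename is identical across all three, and underscore-prefixed filenames are distinctive enough that the lead has no reason to Read any besides these three.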
113
+
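A compact sketch of the existence and order rules follows. It assumes the tool calls have already been extracted from assistant records and scoped to the run window, that paths are POSIX-style, and that timestamps are any mutually comparable values (parsed datetimes in practice). `SIDECAR_ANCHORS`, `first_reads`, and `check_sidecar_reads` are hypothetical names.

```python
from pathlib import PurePosixPath

# Each sidecar must be Read before the first occurrence of its anchor checkpoint.
SIDECAR_ANCHORS = {
    "_implementation-executor.md": "phase-6-synthesis",
    "_implementation-verifier.md": "phase-6-synthesis",
    "_implementation-deliverable.md": "phase-7-persist",
}


def first_reads(tool_uses):
    """Map sidecar basename to its earliest Read timestamp.

    tool_uses: iterable of (timestamp, tool_name, file_path) tuples.
    """
    earliest = {}
    for ts, name, file_path in tool_uses:
        base = PurePosixPath(file_path).name  # layer-independent match
        if name == "Read" and base in SIDECAR_ANCHORS:
            earliest[base] = min(ts, earliest.get(base, ts))
    return earliest


def check_sidecar_reads(tool_uses, anchors):
    """Return failure strings; anchors maps checkpoint id to first-line timestamp."""
    errors = []
    earliest = first_reads(tool_uses)
    for base, checkpoint in SIDECAR_ANCHORS.items():
        if base not in earliest:
            errors.append(f"no Read of {base} in this run's window")
        elif checkpoint in anchors and not earliest[base] < anchors[checkpoint]:
            # Order is only judged when the anchor exists; a missing anchor
            # is already reported by the checkpoint check.
            errors.append(f"{base} was not Read before PROGRESS: {checkpoint}")
    return errors
```

Comparing the earliest Read against the first anchor line means a later re-read of the same sidecar does not create a spurious order violation.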
114
+ ## 5. Integration into validate-run.py
115
+
116
+ ```
117
+ main() # validators/validate-run.py:1969
118
+ ...
119
+ task_type = effective_run_task_type(...)
120
+ + sc_result = validate_session_conformance(
121
+ + team_state=team_state,
122
+ + project_root=project_root,
123
+ + report_path=report_path,
124
+ + run_manifest=run_manifest,
125
+ + task_type=task_type,
126
+ claude_projects_dir=args.claude_projects_dir, # default None → the real ~/.claude/projects
127
+ + )
128
+ + failures.extend(f"session-conformance: {e}" for e in sc_result.errors)
129
+ ```
130
+
131
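+ - New CLI argument `--claude-projects-dir <path>` (optional): an injection point for tests and diagnostics. The default is the real directory. It is an explicit argument rather than an env var (user rule: explicit arguments over env vars). The validator invocation command rendered by the launch prompt needs no change (it uses the default).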
132
+ - Check 2 (heartbeat) does not depend on the jsonl, so it always runs even when the jsonl is not found.
133
+ - Failure behavior is unchanged: non-empty `failures` → `validation_status = "failed"` → `contract-violated` recorded in the run/task manifest ([validators/validate-run.py:2069-2076](../../../validators/validate-run.py:2069)).
134
+
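The integration points in §5 can be pictured with the following self-contained sketch. Only the `--claude-projects-dir` flag, the `session-conformance: ` prefix, the `ok` / `errors` result shape, and the `"failed"` status come from the spec; the class name `ConformanceResult`, the helper names, and the `"passed"` status string are assumptions made for illustration.

```python
import argparse
from dataclasses import dataclass, field
from typing import List


@dataclass
class ConformanceResult:
    """Hypothetical result type exposing the ok / errors pair named in the spec."""

    errors: List[str] = field(default_factory=list)

    @property
    def ok(self) -> bool:
        return not self.errors


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="validate-run")
    # Explicit argument instead of an env var; None means "use the real directory".
    parser.add_argument(
        "--claude-projects-dir",
        default=None,
        metavar="PATH",
        help="override the Claude projects directory (tests and diagnostics)",
    )
    return parser


def fold_failures(failures: List[str], result: ConformanceResult) -> str:
    """Fold conformance errors into the shared failures list and derive a status."""
    failures.extend(f"session-conformance: {e}" for e in result.errors)
    # "passed" is an assumed label; the spec only names the "failed" value.
    return "failed" if failures else "passed"
```

Keeping the override as an optional argument with a `None` default means the validator command rendered by the launch prompt can stay unchanged.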
135
+ ## 6. Test strategy
136
+
137
+ New file `tests/test_validate_session_conformance.py`, following the existing validate-run test pattern (load the module with importlib, then call functions directly; [tests/test_validate_run_report_format.py:33-38](../../../tests/test_validate_run_report_format.py:33)). Verification is at function level rather than through the `main()` integration path, so the existing validate-run tests are unaffected.
138
+
139
+ | Case | Fixture |
140
+ |--------|---------|
141
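+ | Check 1: happy path / missing checkpoint / lines outside the window ignored (contamination from the previous run) / SKILL.md body injection (false-pass prevention: PROGRESS text in user records is not accepted) | Synthetic jsonl (written under tmp_path, using the `--claude-projects-dir` injection point) |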
142
+ | Check 2: happy path / missing started / missing write-result-start / 6-minute gap / regressing or unparseable timestamps | Synthetic audit sidecar md |
143
+ | Check 3: all three Reads present / one missing / Read after the phase-6 anchor (order violation) / existence only when the anchor is absent | Synthetic jsonl |
144
+ | Lead jsonl not found → failure | Empty projects dir |
145
+ | No regression in collect after promoting `resolve_run_window` | Confirm the existing token-usage tests pass |
146
+
147
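+ **Execution verification for declaring the implementation complete (user rule: never declare "verified" on mocked green alone):** beyond the unit tests, run validate-run.py against the real artifacts (lead jsonl + audit sidecar) of one actual okstra run and report the observed result. Until then, the status reads "passes statically and in unit tests; not verified on a real run".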
148
+
149
+ ## 7. Risks and limitations
150
+
151
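+ 1. **jsonl flush timing**: the validator runs as a Bash tool call from the lead. The assumption is that the assistant messages up to that point have been written to the jsonl. Claude Code appends per message, so this holds, but it is confirmed against a real jsonl in P0 as well.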
152
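+ 2. **Harness format dependency**: the jsonl record schema is an internal Claude Code format. Token accounting (`okstra_token_usage`) already carries the same dependency, so this is not a new risk, but a format change breaks both consumers together. The parser behaves conservatively ("skip unrecognized records, and fail when there are zero pieces of evidence"), so there is no silent pass.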
153
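+ 3. **Retroactive failures**: once this check is added, a run whose lead skips the contracts the way past runs did will fail in Phase 7; that is the intent. validate-run is invoked only in Phase 7 of new runs, so already completed runs are unaffected.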
154
+ 4. **Self-reported timestamps**: the limitation of check 2 (§4.2). The goal is detecting omissions; forgery defense is out of scope.
155
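+ 5. **Role spelling variance**: the `worker=<role>` matching in check 1 absorbs variance through normalization, but if variance is observed in real runs, the matching rule is adjusted to follow the SSOT (the team-state role string).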
156
+
157
+ ## 8. Implementation steps (after approval)
158
+
159
+ 1. **P0**: verify the §3.3 record assumptions against one real lead jsonl → finalize the parser signature.
160
+ 2. `scripts/okstra_token_usage/collect.py`: promote `_resolve_run_window` → `resolve_run_window` (update callers).
161
+ 3. New `validators/validate_session_conformance.py`: checks 1, 2, and 3 plus a Result type (`ok` / `errors`), following the existing validator module pattern.
162
+ 4. `validators/validate-run.py`: `--claude-projects-dir` argument + `main()` integration (§5).
163
+ 5. New `tests/test_validate_session_conformance.py` (the §6 cases).
164
+ 6. Update the declarations: add one line each naming the enforcement point (the "one line next to it" of global rule #3):
165
+ - In the Progress reporting section at [agents/SKILL.md:83](../../../agents/SKILL.md:83): one line, "Phase 7 `validate-run.py` scans the lead session jsonl and marks the run contract-violated when a line is missing".
166
+ - In the Heartbeat section at [agents/workers/claude-worker.md:65](../../../agents/workers/claude-worker.md:65): one line on the post-hoc cadence check.
167
+ - In the Entry guard section at [agents/SKILL.md:186](../../../agents/SKILL.md:186): after "visible in the lead session jsonl", state that the validator actually checks it.
168
+ 7. `npm run build` + `python3 -m pytest tests/` + one real-run verification (§6) → add an entry to `CHANGES.md` that includes a `사용자 영향:` ("user impact") line.
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "okstra",
3
- "version": "0.64.1",
3
+ "version": "0.65.0",
4
4
  "description": "Multi-agent cross-verification orchestrator runtime + Claude Code skills.",
5
5
  "license": "MIT",
6
6
  "author": "devonshin",
package/runtime/BUILD.json CHANGED
@@ -1,5 +1,5 @@
1
1
  {
2
- "package": "0.64.1",
3
- "builtAt": "2026-06-10T05:16:54.133Z",
2
+ "package": "0.65.0",
3
+ "builtAt": "2026-06-10T08:26:08.228Z",
4
4
  "repoRoot": "/home/runner/work/okstra/okstra"
5
5
  }
package/runtime/agents/SKILL.md CHANGED
@@ -104,6 +104,8 @@ These lines are the only structured signal the user has during a long run. Do NO
104
104
 
105
105
  `okstra-run` (in-session) surfaces these lines to the user directly; the bash-spawned path leaves them in the session jsonl for post-hoc retrieval. Neither path requires any additional formatting from Lead — emit the literal `PROGRESS:` prefix and the rest of the line as plain text.
106
106
 
107
+ **Enforcement:** the Phase 7 validator (`validators/validate-run.py` → `validate_session_conformance.py`) scans the lead session jsonl (run-window-scoped, assistant text blocks only) and fails the run as `contract-violated` when a required checkpoint is missing — including the per-worker `phase-4-dispatch` / `phase-5-collect` lines, which must name each worker's role. `phase-7-teardown` and `complete` fire after validation and are not checked.
108
+
107
109
  ## Model assignments
108
110
 
109
111
  **The lead never invents a model.** Every role's model is read from `task-manifest.json` → `resultContract.requiredWorkerRoles[*].modelExecutionValue` (and the lead model metadata). A missing assignment is a manifest defect, not a license to fall back — see [okstra-team-contract](./skills/okstra-team-contract/SKILL.md) "Model Assignment Rules". The manifest is always populated at run-prep time by the CLI, which seeds these values from `OKSTRA_DEFAULT_*_MODEL` (`scripts/okstra_ctl/run.py`).
@@ -183,7 +185,7 @@ The `implementation` profile's thin core (`prompts/profiles/implementation.md`)
183
185
  | `prompts/profiles/_implementation-verifier.md` | **Phase 5**, between Executor stage completion and the first verifier dispatch | Verifier roles, Two-tier command lookup, deny-list, discrepancy rule, Read-only command log, verifier-specific forbidden actions |
184
186
  | `prompts/profiles/_implementation-deliverable.md` | **Phase 6**, after Phase 5.5 convergence completes, BEFORE constructing the report-writer dispatch prompt | Required deliverable shape, Validation / TDD evidence rules, Verifier results structure, Self-review pass, Lead post-stage persistence |
185
187
 
186
- **Entry guard (BLOCKING).** Before transitioning into Phase 5 or Phase 6 for an `implementation` run, lead MUST emit a single Read tool call for the sidecar(s) above whose `Read at` matches the entering phase. If lead enters the phase without that Read on record (visible in the lead session jsonl), phase entry is refused: lead writes a `contract-violation` to the run-level errors log with `--message "implementation-sidecar-not-loaded"` and stops. Re-entry requires the sidecar Read first.
188
+ **Entry guard (BLOCKING).** Before transitioning into Phase 5 or Phase 6 for an `implementation` run, lead MUST emit a single Read tool call for the sidecar(s) above whose `Read at` matches the entering phase. If lead enters the phase without that Read on record (visible in the lead session jsonl), phase entry is refused: lead writes a `contract-violation` to the run-level errors log with `--message "implementation-sidecar-not-loaded"` and stops. Re-entry requires the sidecar Read first. **Enforcement:** the Phase 7 validator (`validate_session_conformance.py`) verifies post-hoc that all three sidecar Reads exist in the lead session jsonl within this run's window, and that they precede the `phase-6-synthesis` / `phase-7-persist` checkpoints respectively.
187
189
 
188
190
  The guard is not satisfied by remembering content from a prior run — each implementation run reads the sidecar fresh, because the sidecars are part of the runtime shipped via `okstra install` and may have been updated between runs.
189
191
 
package/runtime/agents/workers/claude-worker.md CHANGED
@@ -62,7 +62,7 @@ Before producing any output, you MUST:
62
62
  1. Extract the absolute path from the lead's `**Worker Preamble Path:**` anchor header and Read that file end-to-end with a single `Read` call (no `offset`, no `limit`). This is the canonical SSOT for the Required Reading + Error Reporting + Output sections contract.
63
63
  2. Read every primary input file the lead enumerated under `## Inputs` (or equivalent heading) in the dispatch prompt body, end-to-end, following the rules stated in the preamble. For analysis workers this is normally `analysis-packet.md`; the source files named inside that packet are fallback/evidence paths to open when needed. Analysis workers do NOT read `final-report-template.md` — that file is for the report writer only.
64
64
 
65
- **Heartbeat — write the audit sidecar EARLY and APPEND per stage (BLOCKING).** Because this worker runs as an in-process Agent or a fresh-session tmux pane, the lead has no `BashOutput`-style liveness signal while waiting for your return. The audit sidecar is the only signal that survives a silent hang. Write the sidecar at `runs/<task-type>/worker-results/claude-worker-audit-<task-type>-<seq>.md` immediately after extracting `Project Root` and the assigned paths — BEFORE the per-file end-to-end reads — with just the heading line (`# Claude Worker Audit — <task-key>`) and one `- PROGRESS: started <ISO-8601-UTC>` line. Then APPEND one short progress line per stage as you advance: `read-<filename>`, `analysis-start`, `findings-draft-start`, `findings-draft-complete`, `write-result-start`. The append cadence MUST NOT exceed 5 minutes — if a single analysis stage is taking longer, emit a `- PROGRESS: in-stage:<stage> <ISO-8601-UTC>` heartbeat. A 5-minute stale sidecar mtime is the canonical "this worker has hung" signal for the operator. Sidecar write/append uses `Write` (initial) and `Edit` / heredoc `>>` (per-stage append).
65
+ **Heartbeat — write the audit sidecar EARLY and APPEND per stage (BLOCKING).** Because this worker runs as an in-process Agent or a fresh-session tmux pane, the lead has no `BashOutput`-style liveness signal while waiting for your return. The audit sidecar is the only signal that survives a silent hang. Write the sidecar at `runs/<task-type>/worker-results/claude-worker-audit-<task-type>-<seq>.md` immediately after extracting `Project Root` and the assigned paths — BEFORE the per-file end-to-end reads — with just the heading line (`# Claude Worker Audit — <task-key>`) and one `- PROGRESS: started <ISO-8601-UTC>` line. Then APPEND one short progress line per stage as you advance: `read-<filename>`, `analysis-start`, `findings-draft-start`, `findings-draft-complete`, `write-result-start`. The append cadence MUST NOT exceed 5 minutes — if a single analysis stage is taking longer, emit a `- PROGRESS: in-stage:<stage> <ISO-8601-UTC>` heartbeat. A 5-minute stale sidecar mtime is the canonical "this worker has hung" signal for the operator. Sidecar write/append uses `Write` (initial) and `Edit` / heredoc `>>` (per-stage append). **Enforcement:** the Phase 7 validator (`validate_session_conformance.py`) parses these `- PROGRESS:` lines post-hoc and fails the run when the first stage is not `started`, `write-result-start` is missing despite an existing result file, timestamps regress/unparse, or consecutive lines are more than 5 minutes (+60s grace) apart.
66
66
 
67
67
  ## Worker Output Structure
68
68
 
package/runtime/agents/workers/codex-worker.md CHANGED
@@ -135,6 +135,7 @@ This wrapper does NOT invoke MCP tools directly. MCP availability inside the Cod
135
135
  - Treat the prompt-history path as the canonical worker prompt history artifact for the current run, resolved to absolute against `Project Root` if given as relative.
136
136
  - The assigned model execution value is canonical for CLI execution. Do not substitute a different Codex model unless the task bundle explicitly changes it.
137
137
  - Pass the prompt received from Lead directly to codex after persisting the exact prompt to the assigned path.
138
+ - **Executor preflight forwarding check (implementation runs only).** When the lead prompt assigns this dispatch the `Executor` role for an `implementation` run, the persisted prompt body MUST contain the literal heading `Coding-conventions preflight` (transcribed by the lead from `prompts/profiles/_implementation-executor.md` → "Pre-implementation context exploration") — the Codex CLI does not share the lead's context, so an untranscribed gate never reaches the process that writes the code. If the heading is absent, return `CODEX_PREFLIGHT_MISSING: executor dispatch prompt lacks the coding-conventions preflight block` instead of invoking the CLI; the lead is responsible for re-dispatching with the block included. This check does NOT apply to verifier or analysis dispatches.
138
139
  - Include context (code, diff, file paths) if provided.
139
140
  - For long prompts, dispatch through the wrapper with literal absolute paths (plus the worktree path for implementation phase):
140
141
  ```bash
package/runtime/agents/workers/gemini-worker.md CHANGED
@@ -135,6 +135,7 @@ This wrapper does NOT invoke MCP tools directly. MCP availability inside the Gem
135
135
  - Treat the prompt-history path as the canonical worker prompt history artifact for the current run, resolved to absolute against `Project Root` if given as relative.
136
136
  - The assigned model execution value is canonical for CLI execution. Do not substitute a different Gemini model unless the task bundle explicitly changes it.
137
137
  - Pass the prompt received from Lead directly to gemini after persisting the exact prompt to the assigned path.
138
+ - **Executor preflight forwarding check (implementation runs only).** When the lead prompt assigns this dispatch the `Executor` role for an `implementation` run, the persisted prompt body MUST contain the literal heading `Coding-conventions preflight` (transcribed by the lead from `prompts/profiles/_implementation-executor.md` → "Pre-implementation context exploration") — the Gemini CLI does not share the lead's context, so an untranscribed gate never reaches the process that writes the code. If the heading is absent, return `GEMINI_PREFLIGHT_MISSING: executor dispatch prompt lacks the coding-conventions preflight block` instead of invoking the CLI; the lead is responsible for re-dispatching with the block included. This check does NOT apply to verifier or analysis dispatches.
138
139
  - Include context (code, diff, file paths) if provided.
139
140
  - For long prompts, dispatch through the wrapper with literal absolute paths (plus the worktree path for implementation phase):
140
141
  ```bash
package/runtime/bin/lib/okstra/cli.sh CHANGED
@@ -83,6 +83,10 @@ while [[ $# -gt 0 ]]; do
83
83
  EXECUTOR_OVERRIDE="$(require_option_value --executor "${2-}")"
84
84
  shift 2
85
85
  ;;
86
+ --critic)
87
+ CRITIC_CHOICE="$(require_option_value --critic "${2-}")"
88
+ shift 2
89
+ ;;
86
90
  --related-tasks)
87
91
  RELATED_TASKS_RAW="$(require_option_value --related-tasks "${2-}")"
88
92
  shift 2
package/runtime/bin/lib/okstra/globals.sh CHANGED
@@ -24,6 +24,7 @@ CODEX_MODEL_OVERRIDE=""
24
24
  GEMINI_MODEL_OVERRIDE=""
25
25
  REPORT_WRITER_MODEL_OVERRIDE=""
26
26
  EXECUTOR_OVERRIDE=""
27
+ CRITIC_CHOICE=""
27
28
  WORK_CATEGORY=""
28
29
  BASE_REF=""
29
30
  RELATED_TASKS_RAW=""
package/runtime/bin/lib/okstra/usage.sh CHANGED
@@ -3,7 +3,7 @@
3
3
  usage() {
4
4
  cat >&2 <<USAGE_EOF
5
5
  usage:
6
- $DISPLAY_COMMAND_NAME [--render-only] [--yes] [--no-plan-verification] --task-type <task-type> [--workers worker1,worker2] [--lead-model <model>] [--claude-model <model>] [--codex-model <model>] [--gemini-model <model>] [--report-writer-model <model>] [--executor claude|codex|gemini] [--related-tasks taskA,taskB] --project-id <project-id> [--project-root <path>] --task-group <task-group> --task-id <task-id> --task-brief <brief-path> [--directive <directive>]
6
+ $DISPLAY_COMMAND_NAME [--render-only] [--yes] [--no-plan-verification] --task-type <task-type> [--workers worker1,worker2] [--lead-model <model>] [--claude-model <model>] [--codex-model <model>] [--gemini-model <model>] [--report-writer-model <model>] [--executor claude|codex|gemini] [--critic off|claude|codex|gemini] [--related-tasks taskA,taskB] --project-id <project-id> [--project-root <path>] --task-group <task-group> --task-id <task-id> --task-brief <brief-path> [--directive <directive>]
7
7
 
8
8
  summary:
9
9
  $DISPLAY_TOOL_NAME prepares a task-keyed instruction bundle for Claude Code and launches an interactive Claude session by default.
@@ -94,6 +94,9 @@ options:
94
94
  The Executor is the only worker allowed to mutate project files; the other two
95
95
  providers are dispatched as read-only verifiers regardless of this selection.
96
96
  Has no effect on other task types.
97
+ --critic Provider for the opt-in Phase 5.6 critic pass (coverage gaps /
98
+ acceptance devil's-advocate). One of: off | claude | codex | gemini.
99
+ Default: off.
97
100
  --related-tasks Optional comma-separated related task identifiers. Example: auth-token-refresh,frontend-login-ui
98
101
  --work-category Work-category classification for this task. One of:
99
102
  bugfix | feature | refactor | ops | improvement | unknown.
package/runtime/bin/okstra.sh CHANGED
@@ -115,6 +115,7 @@ PY_ARGS=(
115
115
  [[ -n "${GEMINI_MODEL_OVERRIDE-}" ]] && PY_ARGS+=(--gemini-model "$GEMINI_MODEL_OVERRIDE")
116
116
  [[ -n "${REPORT_WRITER_MODEL_OVERRIDE-}" ]] && PY_ARGS+=(--report-writer-model "$REPORT_WRITER_MODEL_OVERRIDE")
117
117
  [[ -n "${EXECUTOR_OVERRIDE-}" ]] && PY_ARGS+=(--executor "$EXECUTOR_OVERRIDE")
118
+ [[ -n "${CRITIC_CHOICE-}" ]] && PY_ARGS+=(--critic "$CRITIC_CHOICE")
118
119
  [[ -n "${RELATED_TASKS_RAW-}" ]] && PY_ARGS+=(--related-tasks "$RELATED_TASKS_RAW")
119
120
  [[ -n "${APPROVED_PLAN_PATH-}" ]] && PY_ARGS+=(--approved-plan "$APPROVED_PLAN_PATH")
120
121
  [[ "$APPROVE_PLAN_ACK" == "true" ]] && PY_ARGS+=(--approve)
package/runtime/prompts/profiles/_implementation-executor.md CHANGED
@@ -23,6 +23,7 @@ until Phase 5 ends, then drop from active context for Phase 6/7.
  - **Project review rule packs:** also look for project-local review skills in `<PROJECT_ROOT>/skills/*review*`, `<PROJECT_ROOT>/.claude/skills/*review*`, and up to two parent directories' `skills/*review*/SKILL.md`. Read the relevant `SKILL.md` plus referenced `references/*.md` files and apply their rules during implementation. This is a prevention pass, not a PR-comment generation workflow: do not dispatch reviewer subagents from the executor. For Fonts Ninja-style PR review packs, the executor must avoid newly introduced duplicate helper stacks, tautological tests that merely re-call the delegated helper, self-mocking, domain rules in adapters/ports, domain objects outside `domain/`, dead APIs, weak public names, and functions that fail the plain-English read.
  - **Language-agnostic principles that ALWAYS bind (the TDD loop below MUST satisfy them):** (1) no self-mocking of the SUT — stub/spy only injected collaborators, never the subject's own methods; (2) behavioral assertions on outcomes (return value, state, persisted rows, events, boundary calls) — never `toHaveBeenCalled*` on an internal helper as the only/primary assertion; (3) truthful names — a `get*` / `find*` that writes/inserts, or a name encoding the caller's use-case (`*ForInit`) or hiding a domain rule (`findValid*`), is a defect; (4) single-purpose functions ≤50 effective lines, plain-English readability.
  - **Graceful degradation (codex / gemini executor runtimes, or any runtime where the `~/.claude/skills/okstra-coding-preflight/` files are absent or unreadable):** do NOT skip the gate — apply the agnostic principles above plus the project's own `CLAUDE.md` / `CONTRIBUTING` / formatter+lint config, and record `coding-conventions: skill-unavailable → applied <project rules + agnostic principles>` in the final report. Never claim a skill read that did not happen.
+ - **CLI executor transcription (BLOCKING when the executor provider is `codex` or `gemini`):** the executor CLI process does NOT share the lead's context — a gate that stays in lead memory never reaches it. The lead MUST copy this entire "Coding-conventions preflight" bullet tree (file-read instructions, project review rule packs, agnostic principles, graceful degradation) verbatim into the dispatched executor prompt body. Enforcement: the CLI wrapper agents refuse an implementation-Executor dispatch whose persisted prompt lacks the literal heading `Coding-conventions preflight`, returning `<SENTINEL_PREFIX>_PREFLIGHT_MISSING` (see `agents/workers/_cli-wrapper-template.md` → Prompt Composition).
  - **Mandatory TDD loop**: BEFORE the first `Edit` or `Write` call, the executor MUST apply a red-green-refactor loop for every code change in this run. This is required; skipping it is a `contract-violated` outcome. This governs HOW each step is executed (failing test first → minimal implementation → refactor); it does not override the approved plan's WHAT/file scope.
  - Order of operations per plan step: (1) write/extend the test that captures the step's acceptance criterion and confirm it fails for the right reason, (2) commit the failing test (`test(<scope>): ...`), (3) implement the minimum change to make it pass, (4) commit the implementation (`feat|fix(<scope>): ...`), (5) refactor without changing behaviour and commit separately if any cleanup is made (`refactor(<scope>): ...`). The failing-then-passing transition between steps (2) and (4) is the `TDD evidence` required by the final report.
  - Doc-only / config-only / pure-rename steps that have no observable runtime behaviour are exempt from the failing-test requirement, but the executor MUST cite the exemption per step in the final report (`TDD exemption: <reason>`).
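The enforcement rule added in this hunk (refuse a dispatch whose persisted prompt lacks the literal heading `Coding-conventions preflight`) can be illustrated with a small sketch. The real check lives in the CLI wrapper agent instructions, not in Python; the function name `check_preflight` and the example prefix `"CODEX"` are assumptions made only for illustration, since the diff shows the prefix only as the placeholder `<SENTINEL_PREFIX>`.

```python
from typing import Optional

# Literal heading the wrapper looks for, as quoted in the bullet above.
PREFLIGHT_HEADING = "Coding-conventions preflight"


def check_preflight(prompt_body: str, sentinel_prefix: str) -> Optional[str]:
    """Return a refusal sentinel when the heading is absent, else None.

    Hypothetical helper: `sentinel_prefix` stands in for the
    `<SENTINEL_PREFIX>` placeholder in the wrapper template.
    """
    if PREFLIGHT_HEADING in prompt_body:
        return None  # heading present: the dispatch may proceed
    return f"{sentinel_prefix}_PREFLIGHT_MISSING"


if __name__ == "__main__":
    # "CODEX" is an assumed example prefix, not taken from the package.
    print(check_preflight("Implement step 1 of the plan.", "CODEX"))
```

A plain substring test mirrors the "literal heading" wording: the check does not try to validate the bullet tree itself, only that the lead transcribed it into the prompt body.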
@@ -8,15 +8,17 @@ one of ``{approval, next-phase, none}``. Rows with ``Blocks=approval`` are
  the approval gate: they MUST resolve before the user flips the frontmatter
  ``approved`` field to ``true`` and starts the next ``implementation`` run.

- This module exposes one read function for that gate so both
- ``_validate_approved_plan`` (pre-implementation run-prep) and any later
- validator can share the same parsing logic.
-
- Legacy compatibility: reports written before the §1 unification used
- ``4.5.9 Open Questions`` + ``5.1 Additional Materials`` + ``5.2 Questions
- for the User`` and lacked a ``Blocks`` column. Those reports cannot be
- gate-checked by Blocks; the parser returns ``None`` to signal "schema
- absent, skip check" rather than fabricating a verdict.
+ This module exposes one read function for that gate
+ (``scan_approval_gate``) so both ``_validate_approved_plan``
+ (pre-implementation run-prep) and the wizard share the same parsing logic.
+
+ Gate semantics are fail-closed: when the §1 schema cannot be read with
+ confidence (heading missing/drifted, table header unrecognized, or any
+ body row whose metadata cell fails to parse), the scan reports an
+ ``unreadable_reason`` and callers must refuse approval instead of
+ soft-passing. ``parse_clarification_items`` keeps the lenient
+ None-on-absence contract for the HTML-view renderers, which only need
+ best-effort row extraction.
  """
  from __future__ import annotations
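The fail-closed contract described in the new module docstring can be sketched from the caller's side. This is a minimal illustration, not the package's `_validate_approved_plan` or wizard code: the two dataclasses are stand-ins that mirror only the attributes visible in this diff (`blockers`, `unreadable_reason`, `blocks`, `status`), while `item_id` and `approval_refusal` are hypothetical names introduced here.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClarificationItem:
    """Stand-in row type; `item_id` is an assumed field name."""
    item_id: str
    blocks: str
    status: str


@dataclass(frozen=True)
class ApprovalGateScan:
    """Stand-in with the two fields the diff shows for the gate scan."""
    blockers: list[ClarificationItem]
    unreadable_reason: Optional[str]


def approval_refusal(scan: ApprovalGateScan) -> Optional[str]:
    """Return a refusal message, or None when approval may proceed."""
    # Fail-closed: an unreadable schema refuses approval outright.
    if scan.unreadable_reason is not None:
        return f"approval refused: {scan.unreadable_reason}"
    # Readable schema: refuse only while blocking rows remain unresolved.
    if scan.blockers:
        ids = ", ".join(item.item_id for item in scan.blockers)
        return f"approval refused: unresolved Blocks=approval rows: {ids}"
    return None
```

The ordering matters: `unreadable_reason` is checked first because, per the docstring, `blockers` is only authoritative when the reason is `None`.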
 
@@ -150,24 +152,28 @@ def parse_meta_cell(cell: str) -> Optional[ClarificationItem]:
      )


- def parse_clarification_items(report_text: str) -> Optional[list[ClarificationItem]]:
-     """Return the list of §1 rows. ``None`` means "no §1 meta table detected"
-     (legacy report or missing section) caller must NOT treat that as "table
-     is empty".
-
-     An empty list ``[]`` means "table exists but has no data rows" (e.g.,
-     just the ``- 추가 정보 요청 없음.`` placeholder); that IS a confident
-     "no approval-blocking items".
+ @dataclass(frozen=True)
+ class _Section1Table:
+     """Outcome of walking the §1 slice for its data table.
+
+     ``items`` is ``None`` when no recognizable table header exists among the
+     pipe lines. ``unparsed_row_count`` counts body rows whose metadata cell
+     failed ``parse_meta_cell`` (all-empty filler rows excluded).
+     ``has_pipe_lines`` distinguishes the renderer's legitimate table-less
+     placeholder (emptyState bullet) from a table whose header drifted.
      """
-     section = _section_1_slice(report_text)
-     if section is None:
-         return None
+     items: Optional[list[ClarificationItem]]
+     unparsed_row_count: int
+     has_pipe_lines: bool

+
+ def _walk_section_1_table(section: str) -> _Section1Table:
      lines = section.splitlines()
+     has_pipe_lines = any(line.lstrip().startswith("|") for line in lines)
      # Locate the §1 data table by its header. The merged-meta layout collapses
      # ID/Ticket/Kind/Blocks/Status into one metadata cell and keeps the
      # English `Statement` + `User input` columns; detect on those two (any
-     # other table — intro, legacy 5.1/5.2 — is rejected by returning None).
+     # other table — intro, legacy 5.1/5.2 — is rejected).
      header_idx = -1
      for idx, line in enumerate(lines):
          if not line.lstrip().startswith("|"):
@@ -177,9 +183,10 @@ def parse_clarification_items(report_text: str) -> Optional[list[ClarificationIt
              header_idx = idx
              break
      if header_idx < 0:
-         return None
+         return _Section1Table(None, 0, has_pipe_lines)

      items: list[ClarificationItem] = []
+     unparsed = 0
      body_started = False
      for line in lines[header_idx + 1:]:
          if not line.lstrip().startswith("|"):
@@ -192,32 +199,84 @@ def parse_clarification_items(report_text: str) -> Optional[list[ClarificationIt
          if not body_started:
              continue
          cells = _split_pipe_row(line)
-         if not cells:
+         if not any(cells):
              continue
          item = parse_meta_cell(cells[0])
-         if item is not None:
-             items.append(item)
-     return items
+         if item is None:
+             unparsed += 1
+             continue
+         items.append(item)
+     return _Section1Table(items, unparsed, True)
+
+
+ def parse_clarification_items(report_text: str) -> Optional[list[ClarificationItem]]:
+     """Return the list of §1 rows. ``None`` means "no §1 meta table detected"
+     (missing section or unrecognized table header) — caller must NOT treat
+     that as "table is empty".
+
+     Lenient view-renderer contract: rows whose metadata cell fails to parse
+     are skipped, not surfaced. The approval gate must use
+     ``scan_approval_gate`` instead, which fail-closes on those rows.
+     """
+     section = _section_1_slice(report_text)
+     if section is None:
+         return None
+     return _walk_section_1_table(section).items


  UNRESOLVED_STATUSES = {"open", "answered"}


- def unresolved_approval_blockers(report_text: str) -> Optional[list[ClarificationItem]]:
-     """Return rows that gate the frontmatter ``approved`` flag — ``Blocks=approval``
-     AND ``Status`` in ``{open, answered}``.
+ @dataclass(frozen=True)
+ class ApprovalGateScan:
+     """Fail-closed read of the §1 approval gate.

-     ``None`` propagates the "schema absent" signal from
-     ``parse_clarification_items``: caller may decide to soft-pass legacy
-     reports and only enforce on the new-format §5.
+     ``unreadable_reason`` is ``None`` only when the scan is confident: §1
+     parsed cleanly (or is the legitimate table-less placeholder) and
+     ``blockers`` is therefore authoritative. A non-None reason means the
+     gate must refuse approval — never soft-pass.
      """
-     items = parse_clarification_items(report_text)
-     if items is None:
-         return None
-     return [
-         it for it in items
+     blockers: list[ClarificationItem]
+     unreadable_reason: Optional[str]
+
+
+ def scan_approval_gate(report_text: str) -> ApprovalGateScan:
+     """Scan §1 for unresolved ``Blocks=approval`` rows (``Status`` in
+     ``{open, answered}``), refusing to guess whenever the schema drifted."""
+     section = _section_1_slice(report_text)
+     if section is None:
+         if _LOOSE_SECTION_1_RE.search(report_text):
+             reason = (
+                 "`## 1. Clarification Items` heading exists but does not match "
+                 "the schema heading format (anchor/format drift)"
+             )
+         else:
+             reason = (
+                 "report has no `## 1. Clarification Items` section — the gate "
+                 "cannot confirm there are no unresolved `Blocks=approval` rows"
+             )
+         return ApprovalGateScan([], reason)
+     table = _walk_section_1_table(section)
+     if table.items is None:
+         if table.has_pipe_lines:
+             return ApprovalGateScan([], (
+                 "§1 contains a table but its header row is not the schema "
+                 "header (`| ... | Statement | Expected form | User input |`)"
+             ))
+         # Renderer's emptyState placeholder: heading is intact and no table
+         # was emitted — confidently "no approval-blocking items".
+         return ApprovalGateScan([], None)
+     if table.unparsed_row_count:
+         return ApprovalGateScan([], (
+             f"§1 table has {table.unparsed_row_count} row(s) whose metadata "
+             "cell could not be parsed (Blocks/Status markers missing or "
+             "malformed)"
+         ))
+     blockers = [
+         it for it in table.items
          if it.blocks == "approval" and it.status in UNRESOLVED_STATUSES
      ]
+     return ApprovalGateScan(blockers, None)


  # Loose §1 heading detection: if this matches even though the strict SECTION_HEADING_PATTERN failed,