okstra 0.51.0 → 0.52.0

@@ -0,0 +1,179 @@
1
+ # implementation-planning vertical slices + enforced RED→GREEN design
2
+
3
+ - Date: 2026-06-06
4
+ - Status: Proposed (awaiting user review)
5
+ - Target phase: `implementation-planning` (produces the okstra-run Phase 5 plan)
6
+
7
+ ## 1. Background / Problem
8
+
9
+ To keep PRs from growing too large, we want implementation-planning, when it draws up a
10
+ work plan, to be forced to **cut PR work units into vertical slices (end-to-end per feature)** and to
11
+ structure each slice as a **TDD RED→GREEN** flow.
12
+
13
+ Current state (infrastructure that already exists):
14
+
15
+ - The plan splits work into **Stages**, with a cap of effective steps ≤ 6 per Stage.
16
+ ([implementation-planning.md:69](../../../prompts/profiles/implementation-planning.md:69))
17
+ - Implementation execution follows the **one PR per run** principle (`One-PR-per-run`), so
18
+ the Stage = PR unit structure already holds.
19
+ ([_implementation-executor.md:44](../../../prompts/profiles/_implementation-executor.md:44))
20
+ - The executor's **Mandatory TDD loop** is already enforced: failing test → `test(...)` commit →
21
+ minimal implementation → `feat|fix(...)` commit → refactor.
22
+ ([_implementation-executor.md:25](../../../prompts/profiles/_implementation-executor.md:25))
23
+
24
+ Three gaps:
25
+
26
+ 1. The Stage partition anchor is phrased as **"proximity of co-changing files (cohesion / file proximity)"**,
27
+ which does not match the vertical-slice mental model of an "independently deployable increment of
28
+ user value". ([implementation-planning.md:72](../../../prompts/profiles/implementation-planning.md:72))
29
+ 2. No Stage declares "the user-observable increment this Stage delivers", so a horizontal
30
+ layer cut (horizontal slice) cannot be told apart from a vertical slice.
31
+ 3. TDD at the planning stage is only a **recommendation** (`prefer TDD ordering`), not a requirement.
32
+ ([implementation-planning.md:69](../../../prompts/profiles/implementation-planning.md:69))
33
+ It is enforced only at execution (executor), so plan and execution are not aligned.
34
+
35
+ ## 2. Goals / Non-goals
36
+
37
+ ### Goals
38
+ - Redefine the primary Stage partition anchor as the **vertical slice**.
39
+ - Require each Stage to declare `Slice value:` / `Acceptance:`.
40
+ - Promote the planning-stage `Stepwise Execution Order` to **RED→GREEN mandatory**.
41
+ - **Enforce these rules with a validator**, so that what is declared is what is enforced (Rule #3).
42
+
43
+ ### Non-goals
44
+ - The Stage Map contract skeleton (`## 5.5 Stage Map` + `## 5.5.<i> Stage <i>:` with its 4 subsections)
45
+ is not overhauled (approach A rejected). The skeleton is preserved; only wording and validation are added.
46
+ - The flat `### 5.5.4 Stepwise Execution Order` rendering path of the report template
47
+ ([final-report.template.md:178](../../../templates/reports/final-report.template.md:178))
48
+ is out of scope for this change (the multi-Stage body is written directly by report-writer from
49
+ profile guidance, and `tests/fixtures/plans/valid_one_stage.md` is the actual output shape).
50
+ - The executor's TDD loop behaviour is not changed: it is already enforced, and the plan's
51
+ RED/GREEN steps map 1:1 to the executor's `test(...)` / `feat|fix(...)` commits.
52
+ - No change to the `One-PR-per-run` / parallel-safety (S9) model.
53
+
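54
+ ## 3. Chosen approach: C (surgical change)
55
+
56
+ Leave the `Stage = PR = run` model as is and change only **three pieces of profile wording + one validator check (S10)**.
57
+ This preserves the existing architecture and validator skeleton while meeting both intents (vertical slices +
58
+ explicit RED→GREEN).
59
+
60
+ Approaches A (full validator overhaul) and B (an extra dimension on top of cohesion) are rejected because of change size and duplication.
61
+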
62
+ ## 4. Detailed design
63
+
64
+ ### 4.1 Delta 1: Redefine the partition anchor
65
+
66
+ [implementation-planning.md:72](../../../prompts/profiles/implementation-planning.md:72)
67
+ "Cohesion-first partition rule" → **"Vertical-slice-first partition rule"**:
68
+
69
+ - Primary anchor = **a thin vertical slice that delivers one user-observable increment
70
+ end-to-end**. A Stage completes a single feature increment even when it cuts across layers.
71
+ - File cohesion ("shared file/module proximity") is not discarded; it is demoted to the **secondary
72
+ criterion for grouping steps inside a slice**.
73
+ - State explicitly that **horizontal layering is forbidden**. Example: a split such as "only the DB layer
74
+ in one Stage, only the service layer in the next Stage" is rejected.
75
+ - Keep the existing split triggers and reword only (c):
76
+   - (a) when a real `depends-on` data/contract dependency exists
77
+   - (b) when effective steps exceed 6
78
+   - (c) **when it is a distinct vertical slice (a different user-value increment)**
79
+ - The principle "Maximising parallel stages is NOT a reason to split" is kept.
80
+
81
+ ### 4.2 Delta 2: Per-Stage slice declaration
82
+
83
+ [implementation-planning.md:67-74](../../../prompts/profiles/implementation-planning.md:67)
84
+ Add **two mandatory lines** to each `## 5.5.<i> Stage <i>:` section (directly under the heading, just before Carry-In):
85
+
86
+ ```
87
+ Slice value: <one line stating the user-observable increment this Stage delivers>
88
+ Acceptance: <observable pass condition or the exact command>
89
+ ```
90
+
91
+ - `Slice value` describes "what starts working" from the user/consumer point of view:
92
+ an increment ("querying X returns Y"), not a layer name ("add repository").
93
+ - `Acceptance` is the observable signal that proves the slice is done; it is usually the same
94
+ test command that the RED step in 4.3 flips to PASS.
95
+
96
+ ### 4.3 Delta 3: RED→GREEN mandatory at the planning stage
97
+
98
+ [implementation-planning.md:69](../../../prompts/profiles/implementation-planning.md:69)
99
+ Promote the `### Stepwise Execution Order` requirement from `prefer TDD ordering` to **MUST**:
100
+
101
+ - In every non-exempt Stage, the **`action` cell of the first effective step starts with the literal `RED:`** and
102
+ describes a **failing test** that captures the slice's acceptance (`expected` = FAIL).
103
+ - Among the later implementation steps, at least one `action` cell starts with the literal `GREEN:` and describes
104
+ the minimal implementation that makes the test pass (`expected` = PASS).
105
+ - A refactor step is optional (prefixed with `REFACTOR:` when present).
106
+ - **Exemption**: a Stage with no observable runtime behaviour (doc-only / config-only / pure rename, etc.)
107
+ may put a one-line `TDD exemption: <reason>` in its section and omit RED/GREEN (consistent with the
108
+ executor's equivalent exemption rule, [_implementation-executor.md:27](../../../prompts/profiles/_implementation-executor.md:27)).
109
+
110
+ Why literal tokens (`RED:` / `GREEN:` / `REFACTOR:` / `TDD exemption:`): so that the validator can check
111
+ deterministically by substring instead of the brittle approach of "inferring from the wording whether this is
112
+ a failing test". This is consistent with the literal-substring philosophy of the existing §"Section heading contract"
113
+ ([implementation-planning.md:54](../../../prompts/profiles/implementation-planning.md:54)).
114
+
115
+ ### 4.4 Delta 4: Validator S10 (enforcement)
116
+
117
+ In [validators/validate-implementation-plan-stages.py](../../../validators/validate-implementation-plan-stages.py),
118
+ add `_check_slice_tdd()` and wire it into `collect_validation_errors()`.
119
+ For each Stage section (extracted with `_slice_stage_section`):
120
+
121
+ - **S10a**: a `Slice value:` line exists and the value after the colon is non-empty.
122
+ - **S10b**: an `Acceptance:` line exists and the value after the colon is non-empty.
123
+ - **S10c (TDD ordering)**: one of the following holds.
124
+   - (i) the **action cell of the first effective row of `Stepwise Execution Order` starts with
125
+     `RED:`** AND the action cell of some row in the same table starts with `GREEN:`, **or**
126
+   - (ii) the section contains a `TDD exemption:` line.
127
+
128
+ Implementation notes:
129
+ - To extract the first effective row's action cell, reuse the cell-parsing logic of the existing `_count_effective_steps`
130
+ (header/divider skip, `strip("|").split("|")`) by factoring it out into a new helper. The table column
131
+ order is `step | action | files | command | expected`, so action is index 1.
132
+ - S10 does not run when S1 (missing Stage Map) short-circuits; when stage parsing succeeds, it is
133
+ called at the same level as `_check_each_stage_section`.
134
+ - Error code `S10`, with the stage number included; the message names the missing item.
136
+ ### 4.5 정합성 — executor / convergence
137
+
138
+ - executor 의 per-step TDD loop 은 계획의 RED/GREEN step 을 그대로 실행하면 되므로
139
+ [_implementation-executor.md](../../../prompts/profiles/_implementation-executor.md) 본문 변경 없음.
140
+ 단, 계획이 이미 RED/GREEN 을 명시하므로 executor 가 "계획의 RED step = 첫 실패
141
+ 테스트"로 읽도록 한 줄 정합 코멘트만 선택적으로 추가 가능(필수 아님).
142
+ - §5.5.9 Plan Body Verification / self-review pass 의 "Stage Map self-check"
143
+ ([implementation-planning.md:102](../../../prompts/profiles/implementation-planning.md:102))에
144
+ Slice value/Acceptance/RED-GREEN 확인 항목을 한 줄 추가해 사람-검토와 기계-검토를
145
+ 이중화한다.
146
+
147
+ ## 5. 영향 / 마이그레이션
148
+
149
+ ### 5.1 테스트 픽스처 (BLOCKING)
150
+ S10 추가로 기존 valid 픽스처가 새로 실패하므로 **반드시 갱신**한다:
151
+ - [tests/fixtures/plans/valid_one_stage.md](../../../tests/fixtures/plans/valid_one_stage.md):
152
+ `Slice value:` / `Acceptance:` 두 줄 추가, action 셀을 `RED: ...` / `GREEN: ...` 로 수정.
153
+ - [tests/fixtures/plans/valid_three_stage_parallel.md](../../../tests/fixtures/plans/valid_three_stage_parallel.md):
154
+ 동일 갱신(3 stage 모두).
155
+ - 신규 invalid 픽스처 추가: `invalid_missing_slice_value.md`, `invalid_missing_red_step.md`.
156
+
157
+ ### 5.2 테스트 코드
158
+ - [tests/test_validate_implementation_plan_stages.py](../../../tests/test_validate_implementation_plan_stages.py)
159
+ 에 S10a/S10b/S10c 통과·실패 케이스, `TDD exemption` 면제 통과 케이스 추가.
160
+ - `tests/test_render_final_report.py` / 골든 리포트가 stage 본문을 포함한다면 동반 갱신.
161
+
162
+ ### 5.3 빌드 / 동기화
163
+ - 소스(`prompts/`, `validators/`) 수정 후 `npm run build` 로 `runtime/` 동기화.
164
+ - `runtime/` 직접 수정 금지.
165
+
166
+ ### 5.4 사전 확인 (가정 금지)
167
+ - 구현 착수 시 `report-writer` 가 stage 본문을 실제로 어디서 emit 하는지 1회 실측
168
+ 확인한다(프로파일 가이던스 vs 템플릿). `valid_one_stage.md` 픽스처가 멀티-Stage
169
+ 구조를 보이므로 프로파일 가이던스 경로가 유력하나, 코드로 확인 후 진행한다.
170
+
171
+ ## 6. 수용 기준 (이 설계의 done 조건)
172
+
173
+ 1. 프로파일 분할 앵커가 vertical-slice-first 로 재정의되고 horizontal 금지 명시.
174
+ 2. 각 Stage 가 `Slice value:` / `Acceptance:` 를 선언하도록 프로파일이 요구.
175
+ 3. 계획 `Stepwise Execution Order` 가 `RED:` 첫 step + `GREEN:` 을 MUST 로 요구.
176
+ 4. `validate-implementation-plan-stages.py` S10 이 위 1–3 의 산출물을 강제하고
177
+ `python3 -m pytest tests/test_validate_implementation_plan_stages.py` 가 통과.
178
+ 5. 기존 valid 픽스처 갱신으로 전체 stages 검증 스위트 green.
179
+ 6. `npm run build` 후 `runtime/` 의 프로파일·검증기가 소스와 일치.
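
The two declaration checks described in the design note (S10a and S10b) can be tried out as a standalone Python snippet. This is a minimal sketch, not the released validator: the helper name `missing_slice_lines`, the sample stage text, and the test path inside it are hypothetical, and the patterns here are an assumption that matches only spaces and tabs after the colon, so a bare `Slice value:` followed by another line counts as empty.

```python
import re
from typing import List

# Assumption: horizontal whitespace only ([ \t]) around the colon, and the
# value must start with a non-whitespace character on the same line.
SLICE_VALUE = re.compile(r"^[ \t]*Slice value[ \t]*:[ \t]*(\S.*?)[ \t]*$", re.M)
ACCEPTANCE = re.compile(r"^[ \t]*Acceptance[ \t]*:[ \t]*(\S.*?)[ \t]*$", re.M)


def missing_slice_lines(section: str) -> List[str]:
    """Return the mandatory declaration lines that are absent or empty."""
    missing = []
    if not SLICE_VALUE.search(section):
        missing.append("Slice value")
    if not ACCEPTANCE.search(section):
        missing.append("Acceptance")
    return missing


# Hypothetical stage sections used only for illustration.
GOOD_STAGE = """## 5.5.1 Stage 1: Lookup endpoint
Slice value: querying X returns Y
Acceptance: python3 -m pytest tests/test_lookup.py
### Carry-In
"""

EMPTY_VALUE_STAGE = """## 5.5.1 Stage 1: Lookup endpoint
Slice value:
Acceptance: python3 -m pytest tests/test_lookup.py
"""

print(missing_slice_lines(GOOD_STAGE))
print(missing_slice_lines(EMPTY_VALUE_STAGE))
```

In Python's `re`, `\s` also matches newlines, so a pattern that allows `\s*` after the colon can capture the following line as the value. Restricting the pattern to `[ \t]` is one way to keep the "non-empty value" check tied to the declaration line itself.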
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "okstra",
3
- "version": "0.51.0",
3
+ "version": "0.52.0",
4
4
  "description": "Multi-agent cross-verification orchestrator runtime + Claude Code skills.",
5
5
  "license": "MIT",
6
6
  "author": "devonshin",
@@ -1,5 +1,5 @@
1
1
  {
2
- "package": "0.51.0",
3
- "builtAt": "2026-06-05T15:09:03.772Z",
2
+ "package": "0.52.0",
3
+ "builtAt": "2026-06-05T15:37:23.478Z",
4
4
  "repoRoot": "/home/runner/work/okstra/okstra"
5
5
  }
@@ -55,7 +55,7 @@
55
55
  - The final report MUST include section headings containing each of the following exact strings: `Option Candidates`, `Trade-off`, `Recommended Option`, `Stage Map`, `Stage Exit Contract`, `Stage Validation`, `Dependency`, `Validation Checklist`, `Rollback`. (Approval is no longer a body section — it is the YAML frontmatter `approved` field.)
56
56
  - Korean translations are allowed in parentheses (e.g. `### Recommended Option (권장 옵션)`), but the English keyword must be present verbatim in the heading line.
57
57
  - The shape and ordering follow `final-report-template.md` section 4.5 (`Implementation Plan Deliverables`). Do NOT translate the heading keywords — `validators/validate-run.py` does substring matching on the raw report text and 7-of-8 missing strings is a real, repeatedly observed failure mode (root cause: writer translated the headings to Korean).
58
- - Beyond substring matching, when the Plan Body Verification gate result is `passed` / `passed-with-dissent`, `validators/validate-run.py` runs the **structural** Stage Map validator (`validators/validate-implementation-plan-stages.py`) at the planning boundary — the exact `## 5.5 Stage Map` heading, each `## 5.5.<i> Stage <i>:` section with its four required subsections, the per-stage effective step count (≤6), and the `depends-on` DAG are all enforced here, not deferred to the `implementation` entry gate.
58
+ - Beyond substring matching, when the Plan Body Verification gate result is `passed` / `passed-with-dissent`, `validators/validate-run.py` runs the **structural** Stage Map validator (`validators/validate-implementation-plan-stages.py`) at the planning boundary — the exact `## 5.5 Stage Map` heading, each `## 5.5.<i> Stage <i>:` section with its four required subsections, the per-stage effective step count (≤6), the `depends-on` DAG, and the per-stage vertical-slice contract (S10) are all enforced here, not deferred to the `implementation` entry gate. S10 scans for the literal in-section strings `Slice value:`, `Acceptance:`, and the Stepwise `action`-cell prefixes `RED:` / `GREEN:` (or a `TDD exemption:` line) — keep these tokens verbatim for the same reason as the heading keywords above.
59
59
  - Required deliverable shape (final report, in addition to the standard sections):
60
60
  - at least two implementation options. **Each option must include**:
61
61
  - **File Structure**: an explicit list of files to create / modify / delete with each file's responsibility (one-line each). Use the form `Create: path — responsibility` / `Modify: path:line-range — change summary` / `Delete: path — reason`.
@@ -64,12 +64,15 @@
64
64
  - trade-off matrix across options (rows = options, columns at minimum: complexity, risk, reversibility, test coverage cost, rollout cost)
65
65
  - recommended option with rationale tied to the design principles above
66
66
  - **Stage Map (mandatory — always emitted, even when N=1):** a table of all stages with `stage | title | depends-on | step-count | exit-contract-summary`. `depends-on` is `(none)` or a comma-separated stage number list. Stages with `depends-on (none)` can be implemented in parallel by two simultaneous `implementation` runs.
67
+ - **Per-stage slice declaration (mandatory two lines, directly under the `## 5.5.<i> Stage <i>:` heading, before `### Carry-In`):**
68
+ - `Slice value: <the one user-observable increment this stage delivers, end-to-end>` — describe WHAT starts working from the consumer's view (e.g. "querying X returns Y"), NOT a layer name ("add repository"). Validator S10a rejects a missing/empty value.
69
+ - `Acceptance: <the observable pass condition or the exact command>` — the signal that proves the slice is done; normally the same test command that the `RED:` step below flips to PASS. Validator S10b rejects a missing/empty value.
67
70
  - **Per-stage subsections** (`## 5.5.<i> Stage <i>: <title>` for each `i`), each containing the four required subsections:
68
71
  - `### Carry-In` — for `depends-on (none)`: task-brief only. Otherwise: each depended-on stage's static exit contract + runtime sidecar path `runs/<impl-key>/carry/stage-<i>.json` placeholder.
69
- - `### Stepwise Execution Order` — bite-sized table with `step | action | files | command | expected`. **Effective row count ≤ 6** (excluding header / divider / blank). Each step is one action completable in 2–5 minutes; for code steps include actual code or diff sketch; prefer TDD ordering (failing test → implementation → green → commit).
72
+ - `### Stepwise Execution Order` — bite-sized table with `step | action | files | command | expected`. **Effective row count ≤ 6** (excluding header / divider / blank). Each step is one action completable in 2–5 minutes; for code steps include actual code or diff sketch. **TDD ordering is MUST, not a preference:** the **first** effective step's `action` cell MUST start with the literal `RED:` and describe the failing test that captures this stage's `Acceptance` (`expected` = FAIL); at least one later `action` cell MUST start with the literal `GREEN:` and describe the minimal implementation that makes it pass (`expected` = PASS); an optional refactor step starts with `REFACTOR:`. **Exemption:** doc-only / config-only / pure-rename stages with no observable runtime behaviour may omit RED/GREEN by declaring one line `TDD exemption: <reason>` in the stage section (mirrors the executor's per-step exemption in `_implementation-executor.md`). Validator S10c enforces RED-first + GREEN, or the exemption line.
70
73
  - `### Stage Exit Contract` — predicted added/modified files, newly exposed identifiers/types/endpoints, downstream-usable resources.
71
74
  - `### Stage Validation` — pre / mid / post exact commands or observable outcomes for this stage only.
72
- - **Cohesion-first partition rule (1st-class):** the grouping anchor is **shared file/module proximity**: steps touching the same file/directory/module go in the same stage so the diff, PR, and rollback unit are semantically cohesive. A stage is split ONLY when (a) a real `depends-on` data/contract dependency exists, (b) effective steps would exceed 6, or (c) the file sets are disjoint (unrelated work touching no shared file is not crammed together). Maximising the number of parallel stages is NOT a reason to split — parallelism is an emergent property of independent stages, never a partitioning goal.
75
+ - **Vertical-slice-first partition rule (1st-class):** the grouping anchor is a **thin end-to-end vertical slice** — one stage delivers a single user-observable increment, crossing whatever layers are needed (data → service → API → UI) to make that one increment work. File/module proximity is demoted to the **intra-slice grouping rule**: within a slice, keep steps touching the same file/directory/module together so the diff, PR, and rollback unit stay cohesive. **Horizontal layer-splitting is forbidden** — never carve "the DB layer" into one stage and "the service layer" into the next; that produces stages that ship no standalone user value. A stage is split ONLY when (a) a real `depends-on` data/contract dependency exists, (b) effective steps would exceed 6, or (c) it is a distinct vertical slice (a different user-value increment). Maximising the number of parallel stages is NOT a reason to split — parallelism is an emergent property of independent stages, never a partitioning goal.
73
76
  - **Parallel-safety invariant (BLOCKING):** any two stages that are both `depends-on (none)` MUST predict disjoint file sets in their `Stage Exit Contract`. Two parallel `implementation` runs would otherwise edit the same file concurrently. Work touching a shared file must either go in one stage or be ordered with `depends-on`. Enforced by `validators/validate-implementation-plan-stages.py` check S9.
74
77
  - **Stage exit contract is the carry surface:** keep it as narrow as possible. Wider surface = more downstream coupling.
75
78
  - dependency / migration risk assessment (ordering constraints, data backfills, feature-flag prerequisites, repo-internal sequencing)
@@ -99,4 +102,4 @@
99
102
  4. **Ambiguity check** — any requirement that could be read two ways must be made explicit or moved to the `## 1. Clarification Items` table as a `Blocks=approval` row.
100
103
  5. **Scope check** — if the recommended plan now spans multiple independent subsystems, recommend splitting into separate planning runs rather than shipping an oversized plan.
101
104
  6. **Plan-body verification reconciliation (BLOCKING for implementation-planning).** Inspect the `### 5.5.9 Plan Body Verification` verdict table. For every plan-item row classified as `majority-disagree → C-<N>`, the corresponding `C-<N>` row MUST exist in `## 1. Clarification Items` with `Kind` chosen per the standard policy and `Blocks=approval`. Do NOT create a parallel `### 5.5.x Open Questions` block — the unified table is the single home. Conversely, the `Classification` column's `C-<N>` reference and the `## 1. Clarification Items` `ID` column MUST match 1:1; an orphan on either side is a contract violation. For `partial-consensus` and `worker-unique` plan-items, the dissenting opinion lives in §5.5.9 `Dissent log` and is NOT promoted to §5.
102
- 7. **Stage Map self-check** — for every stage, count the effective rows of its `Stepwise Execution Order` table by hand; reject the draft if any stage exceeds 6. Walk the `depends-on` graph and confirm it is a DAG (no cycle, no self-reference). For each `depends-on` link, confirm it encodes a real data/contract dependency — do NOT add links to serialise unrelated work, and do NOT split a stage merely to create more parallel stages. **Parallel-safety:** for every pair of `depends-on (none)` stages, confirm their `Stage Exit Contract` predicted file sets are disjoint; if they share a file, merge them or add a `depends-on` link (validator S9 rejects overlap).
105
+ 7. **Stage Map self-check** — for every stage, count the effective rows of its `Stepwise Execution Order` table by hand; reject the draft if any stage exceeds 6. Confirm each stage declares a non-empty `Slice value:` and `Acceptance:` line, and that its first step `action` starts with `RED:` with a later `GREEN:` (or carries a `TDD exemption:` line) — this is what validator S10 enforces. Walk the `depends-on` graph and confirm it is a DAG (no cycle, no self-reference). For each `depends-on` link, confirm it encodes a real data/contract dependency — do NOT add links to serialise unrelated work, and do NOT split a stage merely to create more parallel stages. **Parallel-safety:** for every pair of `depends-on (none)` stages, confirm their `Stage Exit Contract` predicted file sets are disjoint; if they share a file, merge them or add a `depends-on` link (validator S9 rejects overlap).
@@ -44,6 +44,26 @@ If `okstra` is not on PATH, tell the user:
44
44
 
45
45
  Do not use `npx` from this skill.
46
46
 
47
+ ## Step 1: Pick the project-group (always first)
48
+
49
+ Every Memory Book entry belongs to a **project-group** — a search-scoping label
50
+ that partitions memory by organization or context (e.g. `acme`, `globex`,
51
+ `private` for personal notes). This selection comes **before** storing or
52
+ searching so the rest of the skill can scope to it.
53
+
54
+ 1. Enumerate existing groups to build recommendations:
55
+
56
+ ```bash
57
+ okstra memory groups --json
58
+ ```
59
+
60
+ 2. Present a 3-option picker (most-used existing group, next existing group,
61
+ then always `직접 입력`, "enter manually"). If `groups` is empty, recommend `private` as the
62
+ first option. Use `private` for the user's personal context.
63
+ 3. Carry the chosen group name into every `add` (`--project-group <name>`) and
64
+ into scoped `search`/`list` (`--project-group <name>`) below. Omit the flag
65
+ only when the user explicitly wants a cross-group search.
66
+
47
67
  ## Store current conversation
48
68
 
49
69
  1. Extract only durable memory from the conversation:
@@ -65,19 +85,22 @@ Do not use `npx` from this skill.
65
85
  Command shape:
66
86
 
67
87
  ```bash
68
- okstra memory add --content "<summary markdown>" --title "<short title>" --type <type> --tag <tag> --project <id> --source conversation --yes
88
+ okstra memory add --content "<summary markdown>" --title "<short title>" --type <type> --project-group <group> --tag <tag> --project <id> --source conversation --yes
69
89
  ```
70
90
 
71
91
  Use repeated `--tag` / `--project` flags when needed. Omit `--project` when no
72
- project is clearly related.
92
+ project is clearly related. `--project-group` is the group chosen in Step 1
93
+ (default `global` if the user truly wants it ungrouped).
73
94
 
74
95
  ## Search / read / archive
75
96
 
76
- Use:
97
+ Scope reads to the chosen project-group by default; drop `--project-group` only
98
+ for an explicit cross-group search:
77
99
 
78
100
  ```bash
79
- okstra memory search "<query>"
80
- okstra memory list --tag "<tag>"
101
+ okstra memory search "<query>" --project-group "<group>"
102
+ okstra memory list --project-group "<group>" --tag "<tag>"
103
+ okstra memory groups # list groups with entry counts
81
104
  okstra memory show "<memory-id>"
82
105
  okstra memory archive "<memory-id>"
83
106
  ```
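
One way to read the project-group picker rule from Step 1 is sketched below in Python. It assumes the group names and entry counts have already been parsed into `(name, count)` pairs; the actual shape of the `okstra memory groups --json` output is not shown in this diff, so that parsing step is left out. `picker_options` is a hypothetical helper, and the behaviour when only one group exists is an assumption as well.

```python
from typing import List, Tuple

MANUAL_ENTRY = "직접 입력"  # literal option label taken from the skill text


def picker_options(groups: List[Tuple[str, int]]) -> List[str]:
    """Build the picker: up to two most-used groups, then the manual entry.

    `groups` is a hypothetical list of (group name, entry count) pairs.
    """
    # sorted() is stable, so groups with equal counts keep their input order.
    ranked = [name for name, _count in sorted(groups, key=lambda g: -g[1])]
    if not ranked:
        # No groups yet: recommend `private` as the first option.
        return ["private", MANUAL_ENTRY]
    return ranked[:2] + [MANUAL_ENTRY]


print(picker_options([("acme", 3), ("globex", 7), ("private", 1)]))
print(picker_options([]))
```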
@@ -1,5 +1,5 @@
1
1
  #!/usr/bin/env python3
2
- """S1–S9 checks for the Stage Map structure of an approved
2
+ """S1–S10 checks for the Stage Map structure of an approved
3
3
  implementation-planning final-report.md. Run from prepare_task_bundle
4
4
  of `implementation` task or standalone."""
5
5
 
@@ -40,7 +40,7 @@ class StageMeta:
40
40
 
41
41
  @dataclass
42
42
  class ValidationError:
43
- code: str # S1..S9
43
+ code: str # S1..S10
44
44
  stage: int # 0 = global
45
45
  message: str
46
46
 
@@ -104,30 +104,36 @@ def _slice_stage_section(text: str, stage_number: int) -> str:
104
104
      return text[start: start + nxt.start()] if nxt else text[start:]
105
105
  
106
106
  
107
- def _count_effective_steps(section: str) -> int:
107
+ def _effective_step_rows(section: str) -> List[List[str]]:
108
+     """Effective (non header/divider/comment) rows of the `### Stepwise
109
+     Execution Order` table, each as a list of stripped cells. Columns are
110
+     `step | action | files | command | expected`, so action is index 1."""
108
111
      m = re.search(r"^###\s+Stepwise Execution Order\b", section, re.M)
109
112
      if not m:
110
-         return 0
113
+         return []
111
114
      body = section[m.end():]
112
115
      nxt = re.search(r"^###\s+\w", body, re.M)
113
116
      if nxt:
114
117
          body = body[: nxt.start()]
115
-     count = 0
118
+     rows: List[List[str]] = []
116
119
      for line in body.splitlines():
117
120
          s = line.strip()
118
121
          if not s or s.startswith("<!--"):
119
122
              continue
120
123
          if not s.startswith("|"):
121
124
              continue
122
-         # Reuse the same header/divider detection as _parse_stage_map:
123
-         # split on `|`, inspect first non-empty cell.
124
-         first_cell = s.strip("|").split("|")[0].strip()
125
+         cells = [c.strip() for c in s.strip("|").split("|")]
126
+         first_cell = cells[0]
125
127
          if first_cell.lower() == "step":
126
128
              continue
127
129
          if set(first_cell) <= set("-: "):
128
130
              continue
129
-         count += 1
130
-     return count
131
+         rows.append(cells)
132
+     return rows
133
+
134
+
135
+ def _count_effective_steps(section: str) -> int:
136
+     return len(_effective_step_rows(section))
131
137
 
132
138
 
133
139
  def _check_each_stage_section(text: str, stages: List[StageMeta]) -> List[ValidationError]:
@@ -159,6 +165,45 @@ def _check_each_stage_section(text: str, stages: List[StageMeta]) -> List[Valida
159
165
      return errs
160
166
  
161
167
  
168
+ SLICE_VALUE = re.compile(r"^\s*Slice value\s*:\s*(.+?)\s*$", re.M)
169
+ ACCEPTANCE = re.compile(r"^\s*Acceptance\s*:\s*(.+?)\s*$", re.M)
170
+ TDD_EXEMPTION = re.compile(r"^\s*TDD exemption\s*:\s*\S", re.M)
171
+
172
+
173
+ def _check_slice_tdd(text: str, stages: List[StageMeta]) -> List[ValidationError]:
174
+     """S10: each stage declares a vertical slice and follows RED→GREEN ordering.
175
+
176
+     S10a — `Slice value:` line with a non-empty value.
177
+     S10b — `Acceptance:` line with a non-empty value.
178
+     S10c — first effective Stepwise step's action starts with `RED:` AND some
179
+            action starts with `GREEN:`, OR a `TDD exemption:` line is present.
180
+     """
181
+     errs: List[ValidationError] = []
182
+     for s in stages:
183
+         section = _slice_stage_section(text, s.stage_number)
184
+         if not section:
185
+             continue  # S3 already reported the missing section
186
+
187
+         if not SLICE_VALUE.search(section):
188
+             errs.append(ValidationError("S10", s.stage_number,
189
+                 "S10a: 'Slice value:' line missing or empty"))
190
+         if not ACCEPTANCE.search(section):
191
+             errs.append(ValidationError("S10", s.stage_number,
192
+                 "S10b: 'Acceptance:' line missing or empty"))
193
+
194
+         if TDD_EXEMPTION.search(section):
195
+             continue
196
+         rows = _effective_step_rows(section)
197
+         actions = [r[1] for r in rows if len(r) > 1]
198
+         first_is_red = bool(actions) and actions[0].startswith("RED:")
199
+         has_green = any(a.startswith("GREEN:") for a in actions)
200
+         if not (first_is_red and has_green):
201
+             errs.append(ValidationError("S10", s.stage_number,
202
+                 "S10c: first step action must start with 'RED:' and some "
203
+                 "step action with 'GREEN:', or add a 'TDD exemption:' line"))
204
+     return errs
205
+
206
+
162
207
  def _check_depends_on(stages: List[StageMeta]) -> List[ValidationError]:
163
208
      errs: List[ValidationError] = []
164
209
      valid = {s.stage_number for s in stages}
@@ -229,7 +274,7 @@ def _check_parallel_safety(text: str, stages: List[StageMeta]) -> List[Validatio
229
274
 
230
275
 
231
276
  def collect_validation_errors(text: str) -> List[ValidationError]:
232
- """All S1–S9 checks against the report text; empty list means valid.
277
+ """All S1–S10 checks against the report text; empty list means valid.
233
278
 
234
279
  S1 (missing `## 5.5 Stage Map` heading) makes the rest unparseable, so it
235
280
  short-circuits. Shared by `main()` (CLI / implementation entry) and the
@@ -244,6 +289,7 @@ def collect_validation_errors(text: str) -> List[ValidationError]:
244
289
  errors.extend(s2_errs)
245
290
  if stages:
246
291
  errors.extend(_check_each_stage_section(text, stages))
292
+ errors.extend(_check_slice_tdd(text, stages))
247
293
  errors.extend(_check_depends_on(stages))
248
294
  errors.extend(_check_parallel_safety(text, stages))
249
295
  return errors
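
The RED-first / GREEN-later rule that S10c applies to the `Stepwise Execution Order` table can be exercised in isolation. The sketch below re-implements the row filtering from the validator diff in a simplified, self-contained form. `effective_rows`, `red_green_ok`, and the sample plan (including its file names and commands) are illustrative assumptions, and the `TDD exemption:` path is left out.

```python
from typing import List


def effective_rows(table: str) -> List[List[str]]:
    """Return table rows as stripped cells, skipping header and divider rows."""
    rows: List[List[str]] = []
    for line in table.splitlines():
        s = line.strip()
        if not s.startswith("|"):
            continue
        cells = [c.strip() for c in s.strip("|").split("|")]
        first_cell = cells[0]
        if first_cell.lower() == "step":  # header row
            continue
        if set(first_cell) <= set("-: "):  # divider row such as |---|---|
            continue
        rows.append(cells)
    return rows


def red_green_ok(table: str) -> bool:
    """True when the first action starts with RED: and some action with GREEN:."""
    # Column order is step | action | files | command | expected, so action is index 1.
    actions = [r[1] for r in effective_rows(table) if len(r) > 1]
    first_is_red = bool(actions) and actions[0].startswith("RED:")
    has_green = any(a.startswith("GREEN:") for a in actions)
    return first_is_red and has_green


# Hypothetical plan table used only for illustration.
PLAN = """| step | action | files | command | expected |
|------|--------|-------|---------|----------|
| 1 | RED: add failing lookup test | tests/test_lookup.py | pytest tests/test_lookup.py | FAIL |
| 2 | GREEN: implement lookup | src/lookup.py | pytest tests/test_lookup.py | PASS |
"""

print(red_green_ok(PLAN))
print(red_green_ok(PLAN.replace("RED:", "Write")))
```

The check is a plain prefix test on the `action` cell, which is why the profile text asks plan authors to keep the `RED:` and `GREEN:` tokens verbatim.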
package/src/memory.mjs CHANGED
@@ -46,8 +46,11 @@ It is separate from project-local .okstra task artifacts.
46
46
 
47
47
  Usage:
48
48
  okstra memory add [--content <text> | --file <path>] [options]
49
- okstra memory list [--limit <n>] [--tag <tag>] [--type <type>] [--json]
50
- okstra memory search <query> [--limit <n>] [--include-archived] [--json]
49
+ okstra memory list [--limit <n>] [--tag <tag>] [--type <type>]
50
+ [--project-group <name>] [--json]
51
+ okstra memory search <query> [--limit <n>] [--project-group <name>]
52
+ [--include-archived] [--json]
53
+ okstra memory groups [--include-archived] [--json]
51
54
  okstra memory show <id> [--json]
52
55
  okstra memory archive <id> [--json]
53
56
 
@@ -55,7 +58,7 @@ Add options:
55
58
  --title <title> Entry title. Defaults to the first non-empty line.
56
59
  --type <type> context|decision|preference|requirement|person|
57
60
  project-hint|follow-up. Default: context.
58
- --scope <scope> Free-form scope label. Default: global.
61
+ --project-group <name> Project group label (e.g. acme, private). Default: global.
59
62
  --project <id> Related project id. Repeatable.
60
63
  --tag <tag> Tag. Repeatable or comma-separated.
61
64
  --source <source> Source label. Default: conversation.
@@ -97,7 +100,7 @@ function parseAddArgs(args) {
97
100
  file: null,
98
101
  title: null,
99
102
  type: "context",
100
- scope: "global",
103
+ projectGroup: "global",
101
104
  source: "conversation",
102
105
  tags: [],
103
106
  projects: [],
@@ -110,7 +113,7 @@ function parseAddArgs(args) {
110
113
  else if (flag === "--file") opts.file = takeValue(args, i++, flag);
111
114
  else if (flag === "--title") opts.title = takeValue(args, i++, flag);
112
115
  else if (flag === "--type") opts.type = takeValue(args, i++, flag);
113
- else if (flag === "--scope") opts.scope = takeValue(args, i++, flag);
116
+ else if (flag === "--project-group") opts.projectGroup = takeValue(args, i++, flag);
114
117
  else if (flag === "--source") opts.source = takeValue(args, i++, flag);
115
118
  else if (flag === "--project") opts.projects.push(takeValue(args, i++, flag));
116
119
  else if (flag === "--tag") opts.tags.push(...splitCsv(takeValue(args, i++, flag)));
@@ -128,24 +131,26 @@ function parseAddArgs(args) {
128
131
  }
129
132
 
130
133
  function parseListArgs(args) {
131
- const opts = { ...parseGlobalFlags(args), limit: 20, tag: null, type: null };
134
+ const opts = { ...parseGlobalFlags(args), limit: 20, tag: null, type: null, projectGroup: null };
132
135
  for (let i = 0; i < args.length; i++) {
133
136
  const flag = args[i];
134
137
  if (flag === "--json" || flag === "--include-archived") continue;
135
138
  if (flag === "--limit") opts.limit = parseLimit(takeValue(args, i++, flag));
136
139
  else if (flag === "--tag") opts.tag = takeValue(args, i++, flag);
137
140
  else if (flag === "--type") opts.type = takeValue(args, i++, flag);
141
+ else if (flag === "--project-group") opts.projectGroup = takeValue(args, i++, flag);
138
142
  else throw new Error(`unknown flag ${flag}`);
139
143
  }
140
144
  return opts;
141
145
  }
142
146
 
143
147
  function parseQueryArgs(args) {
144
- const opts = { ...parseGlobalFlags(args), limit: 20, query: [] };
148
+ const opts = { ...parseGlobalFlags(args), limit: 20, query: [], projectGroup: null };
145
149
  for (let i = 0; i < args.length; i++) {
146
150
  const flag = args[i];
147
151
  if (flag === "--json" || flag === "--include-archived") continue;
148
152
  if (flag === "--limit") opts.limit = parseLimit(takeValue(args, i++, flag));
153
+ else if (flag === "--project-group") opts.projectGroup = takeValue(args, i++, flag);
149
154
  else if (flag.startsWith("--")) throw new Error(`unknown flag ${flag}`);
150
155
  else opts.query.push(flag);
151
156
  }
@@ -153,6 +158,19 @@ function parseQueryArgs(args) {
    return { ...opts, query: opts.query.join(" ").trim() };
  }

+ function parseGroupsArgs(args) {
+   for (const flag of args) {
+     if (flag !== "--json" && flag !== "--include-archived") {
+       throw new Error(`unknown flag ${flag}`);
+     }
+   }
+   return parseGlobalFlags(args);
+ }
+
+ function entryGroup(entry) {
+   return entry.projectGroup ?? "global";
+ }
+
  function parseLimit(raw) {
    const value = Number.parseInt(raw, 10);
    if (!Number.isInteger(value) || value < 1) {
@@ -220,7 +238,7 @@ function buildEntry(opts, content, now) {
      id,
      title,
      type: opts.type,
-     scope: opts.scope,
+     projectGroup: opts.projectGroup,
      source: opts.source,
      tags: [...new Set(opts.tags)],
      relatedProjects: [...new Set(opts.projects)],
@@ -237,7 +255,7 @@ function renderMarkdown(entry, content) {
      `id: ${entry.id}`,
      `createdAt: ${entry.createdAt}`,
      `source: ${entry.source}`,
-     `scope: ${entry.scope}`,
+     `project-group: ${entry.projectGroup}`,
      `type: ${entry.type}`,
      `status: ${entry.status}`,
      `relatedProjects: [${entry.relatedProjects.join(", ")}]`,
@@ -258,6 +276,7 @@ async function confirmSave(entry, content, opts) {
    process.stdout.write(`Memory Book entry:\n`);
    process.stdout.write(` title: ${entry.title}\n`);
    process.stdout.write(` type: ${entry.type}\n`);
+   process.stdout.write(` group: ${entry.projectGroup}\n`);
    process.stdout.write(` tags: ${entry.tags.join(", ") || "(none)"}\n`);
    process.stdout.write(` text: ${truncate(content.trim().replace(/\s+/g, " "), 140)}\n`);
    const rl = createInterface({ input, output });
@@ -341,6 +360,7 @@ async function opList(args) {
    const entries = visibleEntries(await readIndex(), opts)
      .filter((entry) => !opts.tag || entry.tags.includes(opts.tag))
      .filter((entry) => !opts.type || entry.type === opts.type)
+     .filter((entry) => !opts.projectGroup || entryGroup(entry) === opts.projectGroup)
      .slice(0, opts.limit);
    if (opts.json) emitJson(entries);
    else process.stdout.write(entries.map(formatEntryLine).join("\n") + (entries.length ? "\n" : ""));
@@ -350,7 +370,9 @@ async function opList(args) {
  async function opSearch(args) {
    const opts = parseQueryArgs(args);
    const needle = opts.query.toLowerCase();
-   const entries = visibleEntries(await readIndex(), opts);
+   const entries = visibleEntries(await readIndex(), opts).filter(
+     (entry) => !opts.projectGroup || entryGroup(entry) === opts.projectGroup,
+   );
    const matches = [];
    for (const entry of entries) {
      if (await entryMatches(entry, needle)) matches.push(entry);
@@ -361,12 +383,27 @@ async function opSearch(args) {
    return 0;
  }

+ async function opGroups(args) {
+   const opts = parseGroupsArgs(args);
+   const counts = new Map();
+   for (const entry of visibleEntries(await readIndex(), opts)) {
+     const group = entryGroup(entry);
+     counts.set(group, (counts.get(group) ?? 0) + 1);
+   }
+   const groups = [...counts.entries()]
+     .map(([name, count]) => ({ name, count }))
+     .sort((a, b) => b.count - a.count || a.name.localeCompare(b.name));
+   if (opts.json) emitJson(groups);
+   else process.stdout.write(groups.map((g) => `${g.name} (${g.count})`).join("\n") + (groups.length ? "\n" : ""));
+   return 0;
+ }
+
  async function entryMatches(entry, needle) {
    const haystack = [
      entry.id,
      entry.title,
      entry.type,
-     entry.scope,
+     entryGroup(entry),
      entry.source,
      ...entry.tags,
      ...entry.relatedProjects,
@@ -446,6 +483,8 @@ export async function run(args) {
        return await opList(rest);
      case "search":
        return await opSearch(rest);
+     case "groups":
+       return await opGroups(rest);
      case "show":
        return await opShow(rest);
      case "archive":