npm - okstra - Versions diffs - 0.51.0 → 0.52.0 - Mend

okstra 0.51.0 → 0.52.0

Files changed (7)

package/docs/superpowers/specs/2026-06-06-vertical-slice-tdd-planning-design.md +179 -0
package/package.json +1 -1
package/runtime/BUILD.json +2 -2
package/runtime/prompts/profiles/implementation-planning.md +7 -4
package/runtime/skills/okstra-memory/SKILL.md +28 -5
package/runtime/validators/validate-implementation-plan-stages.py +57 -11
package/src/memory.mjs +50 -11

package/docs/superpowers/specs/2026-06-06-vertical-slice-tdd-planning-design.md ADDED

@@ -0,0 +1,179 @@
+# implementation-planning vertical slices + enforced RED→GREEN: design
+- Date written: 2026-06-06
+- Status: Proposed (awaiting user review)
+- Target phase: `implementation-planning` (produces the okstra-run Phase 5 plan)
+## 1. Background / Problem
+To keep PRs from growing too large, we want implementation-planning, when it builds
+a work plan, to **cut PR work units into vertical slices (end-to-end per feature)** and
+to structure each slice as a **TDD RED→GREEN** flow.
+Current state (infrastructure that already exists):
+- The plan splits work into **Stages**, with a cap of effective steps ≤ 6 per Stage.
+  ([implementation-planning.md:69](../../../prompts/profiles/implementation-planning.md:69))
+- At implementation time the rule is **one PR per run** (`One-PR-per-run`), so the
+  Stage = PR unit structure already holds.
+  ([_implementation-executor.md:44](../../../prompts/profiles/_implementation-executor.md:44))
+- The executor's **Mandatory TDD loop** is already enforced: failing test → `test(...)` commit →
+  minimal implementation → `feat|fix(...)` commit → refactor.
+  ([_implementation-executor.md:25](../../../prompts/profiles/_implementation-executor.md:25))
+Three gaps:
+1. The Stage partition anchor is expressed as **"proximity of files that change together
+   (cohesion / file proximity)"**, which does not match the vertical-slice mental model of
+   "an independently shippable increment of user value". ([implementation-planning.md:72](../../../prompts/profiles/implementation-planning.md:72))
+2. No Stage declares "the user-observable increment this Stage delivers", so a
+   horizontal layer cut (horizontal slice) cannot be told apart from a vertical slice.
+3. TDD at the planning stage is only a **recommendation** at the `prefer TDD ordering` level, not a requirement.
+   ([implementation-planning.md:69](../../../prompts/profiles/implementation-planning.md:69))
+   It is enforced only in execution (executor), so planning and execution are asymmetric.
+## 2. Goals / Non-goals
+### Goals
+- Redefine the primary Stage partition anchor as the **vertical slice**.
+- Have each Stage declare `Slice value:` / `Acceptance:`.
+- Raise the planning-stage `Stepwise Execution Order` to **RED→GREEN mandatory**.
+- **Enforce the rules above with a validator**, so that what is declared and what is enforced match (Rule #3).
+### Non-goals
+- No full overhaul of the Stage Map contract skeleton (`## 5.5 Stage Map` + `## 5.5.<i> Stage <i>:` 4-subsection)
+  (Approach A rejected). The skeleton is preserved; only wording and validation are added.
+- The flat `### 5.5.4 Stepwise Execution Order` rendering path in the report template
+  ([final-report.template.md:178](../../../templates/reports/final-report.template.md:178))
+  is out of scope for this change (the multi-Stage body is written directly by report-writer from the
+  profile guidance, and `tests/fixtures/plans/valid_one_stage.md` is the actual output shape).
+- No change to the executor's TDD loop behaviour: it is already enforced, and the plan's
+  RED/GREEN steps map 1:1 to the executor's `test(...)` / `feat|fix(...)` commits.
+- No change to the `One-PR-per-run` / parallel-safety (S9) model.
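+## 3. Chosen approach: C (surgical change)
+Keep the `Stage = PR = run` model as is and change only **three pieces of profile wording + one
+validator check, S10**. This preserves the existing architecture and validator skeleton while
+satisfying both intents (vertical slices + explicit RED→GREEN).
+Approach A (full validator overhaul) and B (an additional dimension on top of cohesion) were rejected because of change size / duplication.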
+## 4. Detailed design
+### 4.1 Delta 1: Redefine the partition anchor
+[implementation-planning.md:72](../../../prompts/profiles/implementation-planning.md:72)
+"Cohesion-first partition rule" → **"Vertical-slice-first partition rule"**:
+- Primary anchor = **a thin vertical slice that delivers one user-observable increment
+  end-to-end**. A Stage completes one feature increment even if it crosses layers.
+- File cohesion ("shared file/module proximity") is not discarded; it is demoted to a
+  **secondary criterion for grouping steps inside a slice**.
+- State explicitly that **horizontal layering is forbidden**. Example: a split such as
+  "only the DB layer in one Stage, only the service layer in the next Stage" is rejected.
+- Keep the existing split triggers, rewording only (c):
+  - (a) when a real `depends-on` data/contract dependency exists
+  - (b) when effective steps exceed 6
+  - (c) **when it is a separate vertical slice (a different user-value increment)**
+- The "Maximising parallel stages is NOT a reason to split" principle is kept.
+### 4.2 Delta 2: Per-Stage slice declaration
+[implementation-planning.md:67-74](../../../prompts/profiles/implementation-planning.md:67)
+Add **two mandatory lines** to each `## 5.5.<i> Stage <i>:` section (directly below the heading, just before Carry-In):
+```
+Slice value: <one line: the user-observable increment this Stage delivers>
+Acceptance: <an observable pass condition or the exact command>
+```
+- `Slice value` describes "what starts working" from the user's/consumer's point of view:
+  not a layer name ("add a repository") but an increment ("querying X returns Y").
+- `Acceptance` is the observable signal that proves the slice is finished; it is usually the
+  same test command that the RED step in 4.3 flips to PASS.
+### 4.3 Delta 3: RED→GREEN mandatory at the planning stage
+[implementation-planning.md:69](../../../prompts/profiles/implementation-planning.md:69)
+Raise the `### Stepwise Execution Order` requirement from `prefer TDD ordering` to **MUST**:
+- For each non-exempt Stage, **the `action` cell of the first effective step starts with the literal `RED:`**
+  and describes the **failing test** that captures the slice's acceptance (`expected` = FAIL).
+- Among the later implementation steps, at least one `action` cell starts with the literal `GREEN:` and
+  describes the minimal implementation that makes the test pass (`expected` = PASS).
+- A refactor step is optional (prefixed with `REFACTOR:` if present).
+- **Exemption**: a Stage with no observable runtime behaviour, such as doc-only / config-only / pure rename,
+  may put a one-line `TDD exemption: <reason>` in its section and omit RED/GREEN (consistent with the
+  executor's matching exemption rule [_implementation-executor.md:27](../../../prompts/profiles/_implementation-executor.md:27)).
+Why literal tokens (`RED:` / `GREEN:` / `REFACTOR:` / `TDD exemption:`): so that the validator can
+check deterministically by substring instead of the brittle approach of "inferring from wording whether
+something is a failing test". This is consistent with the literal-substring philosophy of the existing §"Section heading contract"
+([implementation-planning.md:54](../../../prompts/profiles/implementation-planning.md:54)).
+### 4.4 Delta 4: Validator S10 (enforcement)
+Add `_check_slice_tdd()` to
+[validators/validate-implementation-plan-stages.py](../../../validators/validate-implementation-plan-stages.py)
+and wire it into `collect_validation_errors()`.
+For each Stage section (extracted with `_slice_stage_section`):
+- **S10a**: a `Slice value:` line exists and the value after the colon is non-empty.
+- **S10b**: an `Acceptance:` line exists and the value after the colon is non-empty.
+- **S10c (TDD ordering)**: one of the following holds.
+  - (i) the **action cell of the first effective row** of `Stepwise Execution Order` **starts with
+    `RED:`** AND the action cell of some row in the same table starts with `GREEN:`, **or**
+  - (ii) the section contains a `TDD exemption:` line.
+Implementation notes:
+- To extract the action cell of the first effective row, reuse the cell-parsing logic of the existing
+  `_count_effective_steps` (header/divider skip, `strip("|").split("|")`) and factor it out into a new helper.
+  The table column order is `step | action | files | command | expected`, so action is index 1.
+- S10 does not run when S1 (missing Stage Map) short-circuits; when stage parsing succeeds it is
+  called at the same level as `_check_each_stage_section`.
+- Error code `S10`, includes the stage number, and the message names the missing item.
+### 4.5 Consistency: executor / convergence
+- The executor's per-step TDD loop only has to execute the plan's RED/GREEN steps as written, so the body of
+  [_implementation-executor.md](../../../prompts/profiles/_implementation-executor.md) does not change.
+  However, because the plan now states RED/GREEN explicitly, a one-line alignment comment can optionally be
+  added so the executor reads "the plan's RED step = the first failing test" (not required).
+- Add one line to the "Stage Map self-check" of §5.5.9 Plan Body Verification / self-review pass
+  ([implementation-planning.md:102](../../../prompts/profiles/implementation-planning.md:102))
+  that checks Slice value/Acceptance/RED-GREEN, so that human review and machine review
+  back each other up.
+## 5. Impact / Migration
+### 5.1 Test fixtures (BLOCKING)
+Adding S10 makes the existing valid fixtures newly fail, so they **must be updated**:
+- [tests/fixtures/plans/valid_one_stage.md](../../../tests/fixtures/plans/valid_one_stage.md):
+  add the two lines `Slice value:` / `Acceptance:` and change the action cells to `RED: ...` / `GREEN: ...`.
+- [tests/fixtures/plans/valid_three_stage_parallel.md](../../../tests/fixtures/plans/valid_three_stage_parallel.md):
+  same update (all 3 stages).
+- Add new invalid fixtures: `invalid_missing_slice_value.md`, `invalid_missing_red_step.md`.
+### 5.2 Test code
+- Add to [tests/test_validate_implementation_plan_stages.py](../../../tests/test_validate_implementation_plan_stages.py)
+  passing and failing cases for S10a/S10b/S10c and a passing case for the `TDD exemption` exemption.
+- If `tests/test_render_final_report.py` / the golden report include stage bodies, update them together.
+### 5.3 Build / sync
+- After editing the sources (`prompts/`, `validators/`), sync `runtime/` with `npm run build`.
+- Do not edit `runtime/` directly.
+### 5.4 Pre-checks (no assumptions)
+- When starting implementation, verify once by direct inspection where `report-writer` actually emits
+  the stage body (profile guidance vs template). The `valid_one_stage.md` fixture shows a multi-Stage
+  structure, so the profile-guidance path is likely, but confirm in the code before proceeding.
+## 6. Acceptance criteria (done conditions for this design)
+1. The profile partition anchor is redefined as vertical-slice-first and horizontal splitting is explicitly forbidden.
+2. The profile requires each Stage to declare `Slice value:` / `Acceptance:`.
+3. The plan's `Stepwise Execution Order` requires a `RED:` first step + `GREEN:` as MUST.
+4. S10 in `validate-implementation-plan-stages.py` enforces the outputs of 1–3 above, and
+   `python3 -m pytest tests/test_validate_implementation_plan_stages.py` passes.
+5. With the existing valid fixtures updated, the whole stages validation suite is green.
+6. After `npm run build`, the profile and validator under `runtime/` match the sources.
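
The S10c rule described in section 4.4 of the design doc above can be illustrated with a minimal Python sketch. This is not the package's validator: the helper names `action_cells` and `s10c_ok`, the sample stage title, and the file names and commands in `SAMPLE` are all hypothetical, and the exemption pattern here is a simplified stand-in.

```python
import re
from typing import List

# Hypothetical stage section; title, files and commands are made up for illustration.
SAMPLE = """\
## 5.5.1 Stage 1: Lookup endpoint
Slice value: querying X returns Y
Acceptance: pytest tests/test_lookup.py passes

### Carry-In
task-brief only

### Stepwise Execution Order
| step | action | files | command | expected |
|------|--------|-------|---------|----------|
| 1 | RED: add failing lookup test | tests/test_lookup.py | pytest tests/test_lookup.py | FAIL |
| 2 | GREEN: implement minimal lookup | src/lookup.py | pytest tests/test_lookup.py | PASS |

### Stage Exit Contract
src/lookup.py
"""


def action_cells(section: str) -> List[str]:
    """Return the action cell (column index 1) of each effective table row."""
    m = re.search(r"^###\s+Stepwise Execution Order\b", section, re.M)
    if not m:
        return []
    body = section[m.end():]
    nxt = re.search(r"^###\s+\w", body, re.M)  # stop at the next subsection
    if nxt:
        body = body[: nxt.start()]
    actions: List[str] = []
    for line in body.splitlines():
        s = line.strip()
        if not s.startswith("|"):
            continue
        cells = [c.strip() for c in s.strip("|").split("|")]
        first = cells[0]
        # Skip the header row and the |---|---| divider row.
        if first.lower() == "step" or set(first) <= set("-: "):
            continue
        if len(cells) > 1:
            actions.append(cells[1])
    return actions


def s10c_ok(section: str) -> bool:
    """RED-first plus some GREEN, or an explicit one-line exemption."""
    if re.search(r"^[ \t]*TDD exemption[ \t]*:[ \t]*\S", section, re.M):
        return True
    actions = action_cells(section)
    first_is_red = bool(actions) and actions[0].startswith("RED:")
    has_green = any(a.startswith("GREEN:") for a in actions)
    return first_is_red and has_green


print(s10c_ok(SAMPLE))
```

Because the check is a prefix match on the cell text, a plan that describes the same failing test without the literal `RED:` token would be rejected, which is the deterministic behaviour the design asks for.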

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "okstra",
-  "version": "0.51.0",
+  "version": "0.52.0",
   "description": "Multi-agent cross-verification orchestrator runtime + Claude Code skills.",
   "license": "MIT",
   "author": "devonshin",

package/runtime/BUILD.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "package": "0.51.0",
-  "builtAt": "2026-06-05T15:09:03.772Z",
+  "package": "0.52.0",
+  "builtAt": "2026-06-05T15:37:23.478Z",
   "repoRoot": "/home/runner/work/okstra/okstra"
 }

package/runtime/prompts/profiles/implementation-planning.md CHANGED

@@ -55,7 +55,7 @@
   - The final report MUST include section headings containing each of the following exact strings: `Option Candidates`, `Trade-off`, `Recommended Option`, `Stage Map`, `Stage Exit Contract`, `Stage Validation`, `Dependency`, `Validation Checklist`, `Rollback`. (Approval is no longer a body section — it is the YAML frontmatter `approved` field.)
   - Korean translations are allowed in parentheses (e.g. `### Recommended Option (권장 옵션)`), but the English keyword must be present verbatim in the heading line.
   - The shape and ordering follow `final-report-template.md` section 4.5 (`Implementation Plan Deliverables`). Do NOT translate the heading keywords — `validators/validate-run.py` does substring matching on the raw report text and 7-of-8 missing strings is a real, repeatedly observed failure mode (root cause: writer translated the headings to Korean).
-  - Beyond substring matching, when the Plan Body Verification gate result is `passed` / `passed-with-dissent`, `validators/validate-run.py` runs the **structural** Stage Map validator (`validators/validate-implementation-plan-stages.py`) at the planning boundary — the exact `## 5.5 Stage Map` heading, each `## 5.5.<i> Stage <i>:` section with its four required subsections, the per-stage effective step count (≤6), and the `depends-on` DAG are all enforced here, not deferred to the `implementation` entry gate.
+  - Beyond substring matching, when the Plan Body Verification gate result is `passed` / `passed-with-dissent`, `validators/validate-run.py` runs the **structural** Stage Map validator (`validators/validate-implementation-plan-stages.py`) at the planning boundary — the exact `## 5.5 Stage Map` heading, each `## 5.5.<i> Stage <i>:` section with its four required subsections, the per-stage effective step count (≤6), the `depends-on` DAG, and the per-stage vertical-slice contract (S10) are all enforced here, not deferred to the `implementation` entry gate. S10 scans for the literal in-section strings `Slice value:`, `Acceptance:`, and the Stepwise `action`-cell prefixes `RED:` / `GREEN:` (or a `TDD exemption:` line) — keep these tokens verbatim for the same reason as the heading keywords above.
 - Required deliverable shape (final report, in addition to the standard sections):
   - at least two implementation options. **Each option must include**:
     - **File Structure**: an explicit list of files to create / modify / delete with each file's responsibility (one-line each). Use the form `Create: path — responsibility` / `Modify: path:line-range — change summary` / `Delete: path — reason`.
@@ -64,12 +64,15 @@
   - trade-off matrix across options (rows = options, columns at minimum: complexity, risk, reversibility, test coverage cost, rollout cost)
   - recommended option with rationale tied to the design principles above
   - **Stage Map (mandatory — always emitted, even when N=1):** a table of all stages with `stage | title | depends-on | step-count | exit-contract-summary`. `depends-on` is `(none)` or a comma-separated stage number list. Stages with `depends-on (none)` can be implemented in parallel by two simultaneous `implementation` runs.
+  - **Per-stage slice declaration (mandatory two lines, directly under the `## 5.5.<i> Stage <i>:` heading, before `### Carry-In`):**
+    - `Slice value: <the one user-observable increment this stage delivers, end-to-end>`: describe WHAT starts working from the consumer's view (e.g. "querying X returns Y"), NOT a layer name ("add a repository"). Validator S10a rejects a missing/empty value.
+    - `Acceptance: <the observable pass condition or the exact command>` — the signal that proves the slice is done; normally the same test command that the `RED:` step below flips to PASS. Validator S10b rejects a missing/empty value.
   - **Per-stage subsections** (`## 5.5.<i> Stage <i>: <title>` for each `i`), each containing the four required subsections:
     - `### Carry-In` — for `depends-on (none)`: task-brief only. Otherwise: each depended-on stage's static exit contract + runtime sidecar path `runs/<impl-key>/carry/stage-<i>.json` placeholder.
-    - `### Stepwise Execution Order` — bite-sized table with `step | action | files | command | expected`. **Effective row count ≤ 6** (excluding header / divider / blank). Each step is one action completable in 2–5 minutes; for code steps include actual code or diff sketch; prefer TDD ordering (failing test → implementation → green → commit).
+    - `### Stepwise Execution Order` — bite-sized table with `step | action | files | command | expected`. **Effective row count ≤ 6** (excluding header / divider / blank). Each step is one action completable in 2–5 minutes; for code steps include actual code or diff sketch. **TDD ordering is MUST, not a preference:** the **first** effective step's `action` cell MUST start with the literal `RED:` and describe the failing test that captures this stage's `Acceptance` (`expected` = FAIL); at least one later `action` cell MUST start with the literal `GREEN:` and describe the minimal implementation that makes it pass (`expected` = PASS); an optional refactor step starts with `REFACTOR:`. **Exemption:** doc-only / config-only / pure-rename stages with no observable runtime behaviour may omit RED/GREEN by declaring one line `TDD exemption: <reason>` in the stage section (mirrors the executor's per-step exemption in `_implementation-executor.md`). Validator S10c enforces RED-first + GREEN, or the exemption line.
     - `### Stage Exit Contract` — predicted added/modified files, newly exposed identifiers/types/endpoints, downstream-usable resources.
     - `### Stage Validation` — pre / mid / post exact commands or observable outcomes for this stage only.
-  - **Cohesion-first partition rule (1st-class):** the grouping anchor is **shared file/module proximity** — steps touching the same file/directory/module go in the same stage so the diff, PR, and rollback unit are semantically cohesive. A stage is split ONLY when (a) a real `depends-on` data/contract dependency exists, (b) effective steps would exceed 6, or (c) the file sets are disjoint (unrelated work touching no shared file is not crammed together). Maximising the number of parallel stages is NOT a reason to split — parallelism is an emergent property of independent stages, never a partitioning goal.
+  - **Vertical-slice-first partition rule (1st-class):** the grouping anchor is a **thin end-to-end vertical slice** — one stage delivers a single user-observable increment, crossing whatever layers are needed (data → service → API → UI) to make that one increment work. File/module proximity is demoted to the **intra-slice grouping rule**: within a slice, keep steps touching the same file/directory/module together so the diff, PR, and rollback unit stay cohesive. **Horizontal layer-splitting is forbidden** — never carve "the DB layer" into one stage and "the service layer" into the next; that produces stages that ship no standalone user value. A stage is split ONLY when (a) a real `depends-on` data/contract dependency exists, (b) effective steps would exceed 6, or (c) it is a distinct vertical slice (a different user-value increment). Maximising the number of parallel stages is NOT a reason to split — parallelism is an emergent property of independent stages, never a partitioning goal.
   - **Parallel-safety invariant (BLOCKING):** any two stages that are both `depends-on (none)` MUST predict disjoint file sets in their `Stage Exit Contract`. Two parallel `implementation` runs would otherwise edit the same file concurrently. Work touching a shared file must either go in one stage or be ordered with `depends-on`. Enforced by `validators/validate-implementation-plan-stages.py` check S9.
   - **Stage exit contract is the carry surface:** keep it as narrow as possible. Wider surface = more downstream coupling.
   - dependency / migration risk assessment (ordering constraints, data backfills, feature-flag prerequisites, repo-internal sequencing)
@@ -99,4 +102,4 @@
   4. **Ambiguity check** — any requirement that could be read two ways must be made explicit or moved to the `## 1. Clarification Items` table as a `Blocks=approval` row.
   5. **Scope check** — if the recommended plan now spans multiple independent subsystems, recommend splitting into separate planning runs rather than shipping an oversized plan.
   6. **Plan-body verification reconciliation (BLOCKING for implementation-planning).** Inspect the `### 5.5.9 Plan Body Verification` verdict table. For every plan-item row classified as `majority-disagree → C-<N>`, the corresponding `C-<N>` row MUST exist in `## 1. Clarification Items` with `Kind` chosen per the standard policy and `Blocks=approval`. Do NOT create a parallel `### 5.5.x Open Questions` block — the unified table is the single home. Conversely, the `Classification` column's `C-<N>` reference and the `## 1. Clarification Items` `ID` column MUST match 1:1; an orphan on either side is a contract violation. For `partial-consensus` and `worker-unique` plan-items, the dissenting opinion lives in §5.5.9 `Dissent log` and is NOT promoted to §5.
-  7. **Stage Map self-check** — for every stage, count the effective rows of its `Stepwise Execution Order` table by hand; reject the draft if any stage exceeds 6. Walk the `depends-on` graph and confirm it is a DAG (no cycle, no self-reference). For each `depends-on` link, confirm it encodes a real data/contract dependency — do NOT add links to serialise unrelated work, and do NOT split a stage merely to create more parallel stages. **Parallel-safety:** for every pair of `depends-on (none)` stages, confirm their `Stage Exit Contract` predicted file sets are disjoint; if they share a file, merge them or add a `depends-on` link (validator S9 rejects overlap).
+  7. **Stage Map self-check** — for every stage, count the effective rows of its `Stepwise Execution Order` table by hand; reject the draft if any stage exceeds 6. Confirm each stage declares a non-empty `Slice value:` and `Acceptance:` line, and that its first step `action` starts with `RED:` with a later `GREEN:` (or carries a `TDD exemption:` line) — this is what validator S10 enforces. Walk the `depends-on` graph and confirm it is a DAG (no cycle, no self-reference). For each `depends-on` link, confirm it encodes a real data/contract dependency — do NOT add links to serialise unrelated work, and do NOT split a stage merely to create more parallel stages. **Parallel-safety:** for every pair of `depends-on (none)` stages, confirm their `Stage Exit Contract` predicted file sets are disjoint; if they share a file, merge them or add a `depends-on` link (validator S9 rejects overlap).

package/runtime/skills/okstra-memory/SKILL.md CHANGED

@@ -44,6 +44,26 @@ If `okstra` is not on PATH, tell the user:
 Do not use `npx` from this skill.
+## Step 1: Pick the project-group (always first)
+Every Memory Book entry belongs to a **project-group** — a search-scoping label
+that partitions memory by organization or context (e.g. `acme`, `globex`,
+`private` for personal notes). This selection comes **before** storing or
+searching so the rest of the skill can scope to it.
+1. Enumerate existing groups to build recommendations:
+   ```bash
+   okstra memory groups --json
+   ```
+2. Present a 3-option picker (most-used existing group, next existing group,
+   then always `직접 입력`). If `groups` is empty, recommend `private` as the
+   first option. Use `private` for the user's personal context.
+3. Carry the chosen group name into every `add` (`--project-group <name>`) and
+   into scoped `search`/`list` (`--project-group <name>`) below. Omit the flag
+   only when the user explicitly wants a cross-group search.
 ## Store current conversation
 1. Extract only durable memory from the conversation:
@@ -65,19 +85,22 @@ Do not use `npx` from this skill.
 Command shape:
 ```bash
-okstra memory add --content "<summary markdown>" --title "<short title>" --type <type> --tag <tag> --project <id> --source conversation --yes
+okstra memory add --content "<summary markdown>" --title "<short title>" --type <type> --project-group <group> --tag <tag> --project <id> --source conversation --yes
 ```
 Use repeated `--tag` / `--project` flags when needed. Omit `--project` when no
-project is clearly related.
+project is clearly related. `--project-group` is the group chosen in Step 1
+(default `global` if the user truly wants it ungrouped).
 ## Search / read / archive
-Use:
+Scope reads to the chosen project-group by default; drop `--project-group` only
+for an explicit cross-group search:
 ```bash
-okstra memory search "<query>"
-okstra memory list --tag "<tag>"
+okstra memory search "<query>" --project-group "<group>"
+okstra memory list --project-group "<group>" --tag "<tag>"
+okstra memory groups            # list groups with entry counts
 okstra memory show "<memory-id>"
 okstra memory archive "<memory-id>"
 ```

package/runtime/validators/validate-implementation-plan-stages.py CHANGED

@@ -1,5 +1,5 @@
 #!/usr/bin/env python3
-"""S1–S9 checks for the Stage Map structure of an approved
+"""S1–S10 checks for the Stage Map structure of an approved
 implementation-planning final-report.md. Run from prepare_task_bundle
 of `implementation` task or standalone."""
@@ -40,7 +40,7 @@ class StageMeta:
 @dataclass
 class ValidationError:
-    code: str   # S1..S9
+    code: str   # S1..S10
     stage: int  # 0 = global
     message: str
@@ -104,30 +104,36 @@ def _slice_stage_section(text: str, stage_number: int) -> str:
     return text[start: start + nxt.start()] if nxt else text[start:]
-def _count_effective_steps(section: str) -> int:
+def _effective_step_rows(section: str) -> List[List[str]]:
+    """Effective (non header/divider/comment) rows of the `### Stepwise
+    Execution Order` table, each as a list of stripped cells. Columns are
+    `step | action | files | command | expected`, so action is index 1."""
     m = re.search(r"^###\s+Stepwise Execution Order\b", section, re.M)
     if not m:
-        return 0
+        return []
     body = section[m.end():]
     nxt = re.search(r"^###\s+\w", body, re.M)
     if nxt:
         body = body[: nxt.start()]
-    count = 0
+    rows: List[List[str]] = []
     for line in body.splitlines():
         s = line.strip()
         if not s or s.startswith("<!--"):
             continue
         if not s.startswith("|"):
             continue
-        # Reuse the same header/divider detection as _parse_stage_map:
-        # split on `|`, inspect first non-empty cell.
-        first_cell = s.strip("|").split("|")[0].strip()
+        cells = [c.strip() for c in s.strip("|").split("|")]
+        first_cell = cells[0]
         if first_cell.lower() == "step":
             continue
         if set(first_cell) <= set("-: "):
             continue
-        count += 1
-    return count
+        rows.append(cells)
+    return rows
+def _count_effective_steps(section: str) -> int:
+    return len(_effective_step_rows(section))
 def _check_each_stage_section(text: str, stages: List[StageMeta]) -> List[ValidationError]:
@@ -159,6 +165,45 @@ def _check_each_stage_section(text: str, stages: List[StageMeta]) -> List[Valida
     return errs
+SLICE_VALUE = re.compile(r"^\s*Slice value\s*:\s*(.+?)\s*$", re.M)
+ACCEPTANCE = re.compile(r"^\s*Acceptance\s*:\s*(.+?)\s*$", re.M)
+TDD_EXEMPTION = re.compile(r"^\s*TDD exemption\s*:\s*\S", re.M)
+def _check_slice_tdd(text: str, stages: List[StageMeta]) -> List[ValidationError]:
+    """S10: each stage declares a vertical slice and follows RED→GREEN ordering.
+    S10a — `Slice value:` line with a non-empty value.
+    S10b — `Acceptance:` line with a non-empty value.
+    S10c — first effective Stepwise step's action starts with `RED:` AND some
+           action starts with `GREEN:`, OR a `TDD exemption:` line is present.
+    """
+    errs: List[ValidationError] = []
+    for s in stages:
+        section = _slice_stage_section(text, s.stage_number)
+        if not section:
+            continue   # S3 already reported the missing section
+        if not SLICE_VALUE.search(section):
+            errs.append(ValidationError("S10", s.stage_number,
+                "S10a: 'Slice value:' line missing or empty"))
+        if not ACCEPTANCE.search(section):
+            errs.append(ValidationError("S10", s.stage_number,
+                "S10b: 'Acceptance:' line missing or empty"))
+        if TDD_EXEMPTION.search(section):
+            continue
+        rows = _effective_step_rows(section)
+        actions = [r[1] for r in rows if len(r) > 1]
+        first_is_red = bool(actions) and actions[0].startswith("RED:")
+        has_green = any(a.startswith("GREEN:") for a in actions)
+        if not (first_is_red and has_green):
+            errs.append(ValidationError("S10", s.stage_number,
+                "S10c: first step action must start with 'RED:' and some "
+                "step action with 'GREEN:', or add a 'TDD exemption:' line"))
+    return errs
 def _check_depends_on(stages: List[StageMeta]) -> List[ValidationError]:
     errs: List[ValidationError] = []
     valid = {s.stage_number for s in stages}
@@ -229,7 +274,7 @@ def _check_parallel_safety(text: str, stages: List[StageMeta]) -> List[Validatio
 def collect_validation_errors(text: str) -> List[ValidationError]:
-    """All S1–S9 checks against the report text; empty list means valid.
+    """All S1–S10 checks against the report text; empty list means valid.
     S1 (missing `## 5.5 Stage Map` heading) makes the rest unparseable, so it
     short-circuits. Shared by `main()` (CLI / implementation entry) and the
@@ -244,6 +289,7 @@ def collect_validation_errors(text: str) -> List[ValidationError]:
     errors.extend(s2_errs)
     if stages:
         errors.extend(_check_each_stage_section(text, stages))
+        errors.extend(_check_slice_tdd(text, stages))
         errors.extend(_check_depends_on(stages))
         errors.extend(_check_parallel_safety(text, stages))
     return errors
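
One behaviour of the S10 patterns in this hunk is worth checking when writing fixtures: in Python's `re`, `\s` also matches line breaks, so the `\s*` after the colon can run on to the next line. The sketch below copies `SLICE_VALUE` and `TDD_EXEMPTION` as they appear in the diff and compares them with `SLICE_VALUE_STRICT`, a hypothetical stricter variant that is not part of the package. The sample strings are made up.

```python
import re

# Patterns copied verbatim from the diff above.
SLICE_VALUE = re.compile(r"^\s*Slice value\s*:\s*(.+?)\s*$", re.M)
TDD_EXEMPTION = re.compile(r"^\s*TDD exemption\s*:\s*\S", re.M)

# Hypothetical stricter variant (NOT in the package): only spaces and tabs
# are allowed around the colon, so the value must sit on the same line.
SLICE_VALUE_STRICT = re.compile(
    r"^[ \t]*Slice value[ \t]*:[ \t]*(\S.*?)[ \t]*$", re.M
)

# Made-up inputs: an empty value followed by another line, and a filled value.
empty_value = "Slice value:\nAcceptance: pytest passes\n"
filled_value = "Slice value: querying X returns Y\nAcceptance: pytest passes\n"

loose = SLICE_VALUE.search(empty_value)
strict = SLICE_VALUE_STRICT.search(empty_value)

# The shipped pattern still matches, capturing text from the following line.
print("loose match on empty value:", loose.group(1) if loose else None)
# The stricter variant does not match an empty value.
print("strict match on empty value:", strict)
print("strict match on filled value:", SLICE_VALUE_STRICT.search(filled_value).group(1))

# The exemption pattern behaves the same way with an empty reason.
print("exemption with empty reason:", bool(TDD_EXEMPTION.search("TDD exemption:\n### Carry-In\n")))
```

Under this reading, an empty `Slice value:` line is only reported by S10a when nothing non-blank follows it in the section. Whether that matters depends on how strictly "missing or empty" is meant to be enforced.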

package/src/memory.mjs CHANGED

@@ -46,8 +46,11 @@ It is separate from project-local .okstra task artifacts.
 Usage:
   okstra memory add [--content <text> | --file <path>] [options]
-  okstra memory list [--limit <n>] [--tag <tag>] [--type <type>] [--json]
-  okstra memory search <query> [--limit <n>] [--include-archived] [--json]
+  okstra memory list [--limit <n>] [--tag <tag>] [--type <type>]
+                     [--project-group <name>] [--json]
+  okstra memory search <query> [--limit <n>] [--project-group <name>]
+                       [--include-archived] [--json]
+  okstra memory groups [--include-archived] [--json]
   okstra memory show <id> [--json]
   okstra memory archive <id> [--json]
@@ -55,7 +58,7 @@ Add options:
   --title <title>       Entry title. Defaults to the first non-empty line.
   --type <type>         context|decision|preference|requirement|person|
                         project-hint|follow-up. Default: context.
-  --scope <scope>       Free-form scope label. Default: global.
+  --project-group <name> Project group label (e.g. acme, private). Default: global.
   --project <id>        Related project id. Repeatable.
   --tag <tag>           Tag. Repeatable or comma-separated.
   --source <source>     Source label. Default: conversation.
@@ -97,7 +100,7 @@ function parseAddArgs(args) {
     file: null,
     title: null,
     type: "context",
-    scope: "global",
+    projectGroup: "global",
     source: "conversation",
     tags: [],
     projects: [],
@@ -110,7 +113,7 @@ function parseAddArgs(args) {
     else if (flag === "--file") opts.file = takeValue(args, i++, flag);
     else if (flag === "--title") opts.title = takeValue(args, i++, flag);
     else if (flag === "--type") opts.type = takeValue(args, i++, flag);
-    else if (flag === "--scope") opts.scope = takeValue(args, i++, flag);
+    else if (flag === "--project-group") opts.projectGroup = takeValue(args, i++, flag);
     else if (flag === "--source") opts.source = takeValue(args, i++, flag);
     else if (flag === "--project") opts.projects.push(takeValue(args, i++, flag));
     else if (flag === "--tag") opts.tags.push(...splitCsv(takeValue(args, i++, flag)));
@@ -128,24 +131,26 @@ function parseAddArgs(args) {
 }
 function parseListArgs(args) {
-  const opts = { ...parseGlobalFlags(args), limit: 20, tag: null, type: null };
+  const opts = { ...parseGlobalFlags(args), limit: 20, tag: null, type: null, projectGroup: null };
   for (let i = 0; i < args.length; i++) {
     const flag = args[i];
     if (flag === "--json" || flag === "--include-archived") continue;
     if (flag === "--limit") opts.limit = parseLimit(takeValue(args, i++, flag));
     else if (flag === "--tag") opts.tag = takeValue(args, i++, flag);
     else if (flag === "--type") opts.type = takeValue(args, i++, flag);
+    else if (flag === "--project-group") opts.projectGroup = takeValue(args, i++, flag);
     else throw new Error(`unknown flag ${flag}`);
   }
   return opts;
 }
 function parseQueryArgs(args) {
-  const opts = { ...parseGlobalFlags(args), limit: 20, query: [] };
+  const opts = { ...parseGlobalFlags(args), limit: 20, query: [], projectGroup: null };
   for (let i = 0; i < args.length; i++) {
     const flag = args[i];
     if (flag === "--json" || flag === "--include-archived") continue;
     if (flag === "--limit") opts.limit = parseLimit(takeValue(args, i++, flag));
+    else if (flag === "--project-group") opts.projectGroup = takeValue(args, i++, flag);
     else if (flag.startsWith("--")) throw new Error(`unknown flag ${flag}`);
     else opts.query.push(flag);
   }
@@ -153,6 +158,19 @@ function parseQueryArgs(args) {
   return { ...opts, query: opts.query.join(" ").trim() };
 }
+function parseGroupsArgs(args) {
+  for (const flag of args) {
+    if (flag !== "--json" && flag !== "--include-archived") {
+      throw new Error(`unknown flag ${flag}`);
+    }
+  }
+  return parseGlobalFlags(args);
+}
+function entryGroup(entry) {
+  return entry.projectGroup ?? "global";
+}
 function parseLimit(raw) {
   const value = Number.parseInt(raw, 10);
   if (!Number.isInteger(value) || value < 1) {
@@ -220,7 +238,7 @@ function buildEntry(opts, content, now) {
     id,
     title,
     type: opts.type,
-    scope: opts.scope,
+    projectGroup: opts.projectGroup,
     source: opts.source,
     tags: [...new Set(opts.tags)],
     relatedProjects: [...new Set(opts.projects)],
@@ -237,7 +255,7 @@ function renderMarkdown(entry, content) {
     `id: ${entry.id}`,
     `createdAt: ${entry.createdAt}`,
     `source: ${entry.source}`,
-    `scope: ${entry.scope}`,
+    `project-group: ${entry.projectGroup}`,
     `type: ${entry.type}`,
     `status: ${entry.status}`,
     `relatedProjects: [${entry.relatedProjects.join(", ")}]`,
@@ -258,6 +276,7 @@ async function confirmSave(entry, content, opts) {
   process.stdout.write(`Memory Book entry:\n`);
   process.stdout.write(`  title: ${entry.title}\n`);
   process.stdout.write(`  type:  ${entry.type}\n`);
+  process.stdout.write(`  group: ${entry.projectGroup}\n`);
   process.stdout.write(`  tags:  ${entry.tags.join(", ") || "(none)"}\n`);
   process.stdout.write(`  text:  ${truncate(content.trim().replace(/\s+/g, " "), 140)}\n`);
   const rl = createInterface({ input, output });
@@ -341,6 +360,7 @@ async function opList(args) {
   const entries = visibleEntries(await readIndex(), opts)
     .filter((entry) => !opts.tag || entry.tags.includes(opts.tag))
     .filter((entry) => !opts.type || entry.type === opts.type)
+    .filter((entry) => !opts.projectGroup || entryGroup(entry) === opts.projectGroup)
     .slice(0, opts.limit);
   if (opts.json) emitJson(entries);
   else process.stdout.write(entries.map(formatEntryLine).join("\n") + (entries.length ? "\n" : ""));
@@ -350,7 +370,9 @@ async function opList(args) {
 async function opSearch(args) {
   const opts = parseQueryArgs(args);
   const needle = opts.query.toLowerCase();
-  const entries = visibleEntries(await readIndex(), opts);
+  const entries = visibleEntries(await readIndex(), opts).filter(
+    (entry) => !opts.projectGroup || entryGroup(entry) === opts.projectGroup,
+  );
   const matches = [];
   for (const entry of entries) {
     if (await entryMatches(entry, needle)) matches.push(entry);
@@ -361,12 +383,27 @@ async function opSearch(args) {
   return 0;
 }
+async function opGroups(args) {
+  const opts = parseGroupsArgs(args);
+  const counts = new Map();
+  for (const entry of visibleEntries(await readIndex(), opts)) {
+    const group = entryGroup(entry);
+    counts.set(group, (counts.get(group) ?? 0) + 1);
+  }
+  const groups = [...counts.entries()]
+    .map(([name, count]) => ({ name, count }))
+    .sort((a, b) => b.count - a.count || a.name.localeCompare(b.name));
+  if (opts.json) emitJson(groups);
+  else process.stdout.write(groups.map((g) => `${g.name}  (${g.count})`).join("\n") + (groups.length ? "\n" : ""));
+  return 0;
+}
 async function entryMatches(entry, needle) {
   const haystack = [
     entry.id,
     entry.title,
     entry.type,
-    entry.scope,
+    entryGroup(entry),
     entry.source,
     ...entry.tags,
     ...entry.relatedProjects,
@@ -446,6 +483,8 @@ export async function run(args) {
         return await opList(rest);
       case "search":
         return await opSearch(rest);
+      case "groups":
+        return await opGroups(rest);
       case "show":
         return await opShow(rest);
       case "archive":
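
The ordering that `opGroups` applies above (entry count descending, then group name ascending, with entries lacking a group counted under `global`) can be sketched as follows. Python is used only to keep the added examples in one language. `group_counts`, `entry_group` and the sample entries are hypothetical, and plain string comparison stands in for JavaScript's `localeCompare`, which may order non-ASCII names differently.

```python
from collections import Counter
from typing import Dict, List


def entry_group(entry: Dict[str, object]) -> str:
    """Mirror `entry.projectGroup ?? "global"`: only a missing/None value falls back."""
    group = entry.get("projectGroup")
    return "global" if group is None else str(group)


def group_counts(entries: List[Dict[str, object]]) -> List[Dict[str, object]]:
    """Count entries per group, most-used first, ties broken by name."""
    counts = Counter(entry_group(e) for e in entries)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"name": name, "count": count} for name, count in ordered]


# Made-up index entries; the last one predates the projectGroup field.
entries = [
    {"projectGroup": "acme"},
    {"projectGroup": "private"},
    {"projectGroup": "acme"},
    {"title": "legacy entry without a group"},
]

print(group_counts(entries))
```

A most-used-first list of this shape is what the `okstra-memory` skill's Step 1 picker consumes when it recommends the top two existing groups.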