okstra 0.45.0 → 0.46.0

@@ -27,10 +27,9 @@ until Phase 5 ends, then drop from active context for Phase 6/7.
  - Doc-only / config-only / pure-rename steps that have no observable runtime behaviour are exempt from the failing-test requirement, but the executor MUST cite the exemption per step in the final report (`TDD exemption: <reason>`).
  - When the touched area has no existing test harness, the executor MUST stand up the minimum harness needed to host one regression test for this run rather than skipping TDD entirely. Record the harness-bootstrap step as an `Out-of-plan edit` if it is not in the plan.
  - **DB / IO / SQL changes require real execution — mock-only is NOT validation evidence:** when this run's diff touches DB/IO/SQL (ORM / query-builder code — sequelize / typeorm / prisma / knex / raw SQL — `*.repository.*`, model/entity files, `migrations/**`, `*.sql`, or any changed query string), a mocked unit test cannot observe the SQL the query builder actually emits — a mocked suite once passed while `count({ col: 'FontFamily.fontFamily' })` threw `Unknown column` on the real DB. The executor MUST run the change against a real (or faithful-replica) datastore — the `db-test` validation step (plan `validation` db step, else `project.json.qaCommands.db-test`), targeting a **local / replica** DB — and cite its exact command + exit code in the final report's `Validation evidence`. If no real DB / `db-test` command is reachable, do NOT claim the change verified: label the DB portion `정적 분석상 …, 미검증(실행 안 함)` in the report, surface it in the routing recommendation, and never downplay the real run as "too heavy". `git push` stays forbidden (universal list); the unverified DB state is carried forward so `final-verification` cannot accept it and `release-handoff` cannot push.
- - re-read the approved plan end-to-end and parse the `## 4.5 Stage Map`. Determine **start stage**:
- - if `--stage <N>` is supplied, use N. Otherwise auto = the lowest stage number whose `depends-on` are all recorded as `status:done` in `runs/<plan-key>/consumers.jsonl` AND that itself has no `status:done` row. Multiple stages may match; two parallel `implementation` runs may pick different ones and proceed concurrently.
- - load every `runs/<plan-key>/carry/stage-<i>.json` for `i ∈ depends-on(start_stage)` and inject them into the executor's working context as "runtime carry-in". For `depends-on (none)` stages, no sidecar load: task-brief only.
- - extract the **start stage's** file list, step order, Stage Validation commands, Stage Exit Contract, and rollback path. These — not the whole plan — are the authoritative scope for this run.
+ - re-read the approved plan end-to-end and parse the `## 4.5 Stage Map`. Read the **Stage batch** injected in the launch prompt (`Stage batch for this implementation run`): it lists the stage numbers this run owns, ascending. The runtime already selected and reserved this batch — do NOT recompute the start stage from `consumers.jsonl`.
+ - for each stage in the batch, load every `runs/<plan-key>/carry/stage-<i>.json` for `i ∈ depends-on(stage)` and inject them into the executor's working context as "runtime carry-in". For `depends-on (none)` stages, no sidecar load: task-brief only.
+ - the batch's stages are mutually independent (each one's `depends-on` are all already `status:done`, never another batch member), so execute them in ascending order; each stage's file list, step order, Stage Validation commands, Stage Exit Contract, and rollback path are the authoritative scope for that stage.
  - inspect the current state of every file the plan names; if any file has changed materially since the plan was written, stop and route to a new `implementation-planning` run instead of editing speculatively
  - "materially changed" means: the function, class, section, or behaviour the plan targets has been edited, renamed, moved, removed, or otherwise altered in a way that invalidates the plan's reasoning. Cosmetic edits (whitespace, comment-only changes, unrelated function modifications elsewhere in the same file) do NOT trigger a re-plan; cite the diff (`git log --oneline <plan-created-at>..HEAD -- <file>`) in the final report and proceed.
  - distinguish the two file-scope rules (they are not in conflict):
@@ -38,15 +37,14 @@ until Phase 5 ends, then drop from active context for Phase 6/7.
  - **out-of-plan rule** (Allowed actions section below): if a step *requires touching a file NOT in the plan list*, that is permitted with `Out-of-plan edits` justification. This handles honest scope discovery during execution.
  - confirm the test/build commands referenced in the plan still exist and run from a clean state
 
- ## Stage execution contract (this run owns exactly one stage of the plan)
+ ## Stage execution contract (this run owns the injected stage batch)
 
- - **Sidecar evidence writer (BLOCKING).** When the start stage's Stage Validation `post` commands all succeed, the Executor MUST emit a JSON object matching the schema in `docs/superpowers/specs/2026-05-20-implementation-planning-multi-stage-design.md` §3.2 and the lead MUST persist it to `runs/<impl-task-key>/carry/stage-<N>.json`. The file MUST NOT exist before the run starts (overwrite is refused — see `--force-stage` non-goal).
- - **Reverse link (BLOCKING).** Before the first Edit/Write, append a `status:"started"` row to `runs/<plan-task-key>/consumers.jsonl` (lock via the okstra runtime). On stage completion, append a `status:"done"` row with `carry_path` populated.
- - **One-PR-per-stage.** This run creates exactly one PR titled `Stage <N>: <stage title>`. The PR body MUST include:
-   - `## Stage` — number and title (from Stage Map row).
-   - `## Carry-In summary` — depends-on list + cited identifiers/SHAs from each loaded sidecar (omit when depends-on is empty).
-   - `## Next stage` — next stage number/title or `(last stage)`.
- Stage PRs link back to each other in their bodies (`Previous: #<n>, Next: #<m>` lines) so a reviewer can navigate the chain.
+ - **Sidecar evidence writer (BLOCKING, per stage).** For each stage in the batch, when that stage's Stage Validation `post` commands all succeed, the Executor MUST emit a JSON object matching the schema in `docs/superpowers/specs/2026-05-20-implementation-planning-multi-stage-design.md` §3.2 and the lead MUST persist it to `runs/<impl-task-key>/carry/stage-<N>.json`. Each file MUST NOT exist before the run starts (overwrite is refused — see `--force-stage` non-goal).
+ - **Reverse link (BLOCKING, per stage).** The runtime already appended a `status:"started"` row per batch stage before this run began. On each stage's completion, append a `status:"done"` row with `carry_path` populated for that stage number.
+ - **One-PR-per-run.** This run creates exactly one PR titled `Stages <first>–<last>: <run summary>` (or `Stage <N>: <title>` when the batch is a single stage). The PR body MUST include:
+   - `## Stage <N>`: one section per batched stage, giving number, title (from Stage Map row), touched files, and validation result.
+   - `## Carry-In summary` — per stage, depends-on list + cited identifiers/SHAs from each loaded sidecar (omit when depends-on is empty).
+   - `## Previous run` / `## Next run` — links so a reviewer can navigate the run chain.
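
The One-PR-per-run title rule above can be sketched as a small helper. This is an illustration only: `pr_title` is a hypothetical name, not a function shown in this diff, and it assumes the batch arrives as an ascending list of stage numbers.

```python
from typing import List


def pr_title(batch: List[int], summary: str) -> str:
    """Build the PR title for a stage batch (hypothetical helper)."""
    if not batch:
        raise ValueError("a run must own at least one stage")
    if len(batch) == 1:
        # A single-stage batch keeps the per-stage title form.
        return f"Stage {batch[0]}: {summary}"
    # A multi-stage batch joins the first and last stage number with an en dash.
    return f"Stages {batch[0]}–{batch[-1]}: {summary}"
```

Because a batch holds only ready stages, its numbers need not be contiguous; the title still uses just the first and last entry, so the per-stage `## Stage <N>` sections in the body remain the authoritative list.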
 
  ## Allowed actions during the run
 
@@ -65,7 +65,8 @@
  - `### Stepwise Execution Order` — bite-sized table with `step | action | files | command | expected`. **Effective row count ≤ 6** (excluding header / divider / blank). Each step is one action completable in 2–5 minutes; for code steps include actual code or diff sketch; prefer TDD ordering (failing test → implementation → green → commit).
  - `### Stage Exit Contract` — predicted added/modified files, newly exposed identifiers/types/endpoints, downstream-usable resources.
  - `### Stage Validation` — pre / mid / post exact commands or observable outcomes for this stage only.
- - **Parallelisation-first rule (1st-class):** the writer MUST prefer the partition that maximises the number of `depends-on (none)` stages. Given two partitions with equal total step count, the one with fewer `depends-on` edges wins. Conservative `let's serialise to be safe` groupings are forbidden; each `depends-on` link is justified by a concrete data/contract dependency, not a vague risk concern.
+ - **Cohesion-first partition rule (1st-class):** the grouping anchor is **shared file/module proximity** — steps touching the same file/directory/module go in the same stage so the diff, PR, and rollback unit are semantically cohesive. A stage is split ONLY when (a) a real `depends-on` data/contract dependency exists, (b) effective steps would exceed 6, or (c) the file sets are disjoint (unrelated work touching no shared file is not crammed together). Maximising the number of parallel stages is NOT a reason to split — parallelism is an emergent property of independent stages, never a partitioning goal.
+ - **Parallel-safety invariant (BLOCKING):** any two stages that are both `depends-on (none)` MUST predict disjoint file sets in their `Stage Exit Contract`. Two parallel `implementation` runs would otherwise edit the same file concurrently. Work touching a shared file must either go in one stage or be ordered with `depends-on`. Enforced by `validators/validate-implementation-plan-stages.py` check S9.
  - **Stage exit contract is the carry surface:** keep it as narrow as possible. Wider surface = more downstream coupling.
  - dependency / migration risk assessment (ordering constraints, data backfills, feature-flag prerequisites, repo-internal sequencing)
  - validation checklist (pre / mid / post) — each item is an exact command or observable outcome
@@ -93,4 +94,4 @@
  4. **Ambiguity check** — any requirement that could be read two ways must be made explicit or moved to the `## 5. Clarification Items` table as a `Blocks=approval` row.
  5. **Scope check** — if the recommended plan now spans multiple independent subsystems, recommend splitting into separate planning runs rather than shipping an oversized plan.
  6. **Plan-body verification reconciliation (BLOCKING for implementation-planning).** Inspect the `### 4.5.9 Plan Body Verification` verdict table. For every plan-item row classified as `majority-disagree → C-<N>`, the corresponding `C-<N>` row MUST exist in `## 5. Clarification Items` with `Kind` chosen per the standard policy and `Blocks=approval`. Do NOT create a parallel `### 4.5.x Open Questions` block — the unified table is the single home. Conversely, the `Classification` column's `C-<N>` reference and the `## 5. Clarification Items` `ID` column MUST match 1:1; an orphan on either side is a contract violation. For `partial-consensus` and `worker-unique` plan-items, the dissenting opinion lives in §4.5.9 `Dissent log` and is NOT promoted to §5.
- 7. **Stage Map self-check** — for every stage, count the effective rows of its `Stepwise Execution Order` table by hand; reject the draft if any stage exceeds 6. Walk the `depends-on` graph and confirm it is a DAG (no cycle, no self-reference). For each `depends-on` link, ask "can this be removed by re-partitioning?" — if yes, re-partition and re-count.
+ 7. **Stage Map self-check** — for every stage, count the effective rows of its `Stepwise Execution Order` table by hand; reject the draft if any stage exceeds 6. Walk the `depends-on` graph and confirm it is a DAG (no cycle, no self-reference). For each `depends-on` link, confirm it encodes a real data/contract dependency; do NOT add links to serialise unrelated work, and do NOT split a stage merely to create more parallel stages. **Parallel-safety:** for every pair of `depends-on (none)` stages, confirm their `Stage Exit Contract` predicted file sets are disjoint; if they share a file, merge them or add a `depends-on` link (validator S9 rejects overlap).
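
The step-count and DAG parts of the Stage Map self-check could be automated roughly as follows. This is a sketch under stated assumptions, not the validator shipped in the package: `check_stage_graph` and its two dict inputs are hypothetical, and the cycle test relies on the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List


def check_stage_graph(
    depends_on: Dict[int, List[int]],
    steps: Dict[int, int],
    max_steps: int = 6,
) -> List[str]:
    """Return human-readable problems; an empty list means the draft passes."""
    problems: List[str] = []
    # Effective step count per stage must not exceed the limit.
    for stage, count in sorted(steps.items()):
        if count > max_steps:
            problems.append(f"stage {stage}: {count} effective steps > {max_steps}")
    # A stage that lists itself as a dependency is rejected explicitly.
    for stage, deps in sorted(depends_on.items()):
        if stage in deps:
            problems.append(f"stage {stage}: depends on itself")
    try:
        # static_order() raises CycleError when the graph is not a DAG.
        list(TopologicalSorter(depends_on).static_order())
    except CycleError as exc:
        problems.append(f"cycle in depends-on graph: {exc.args[1]}")
    return problems
```

The check for whether each `depends-on` link encodes a real data/contract dependency is a judgment call and stays manual; only the mechanical counts and graph shape are covered here.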
@@ -1,6 +1,7 @@
  # Implementation Profile
 
  - Purpose: realise the approved `implementation-planning` deliverable as actual source changes, with cross-model verification, while keeping the run reversible
+ - **Run-level fixed cost:** the verifier set, Phase 5.5 convergence, and the Phase 6 report-writer run exactly once per run, over the combined diff of all stages in this run's batch — never once per stage.
  - Required workers:
    - claude
    - codex
@@ -229,11 +229,11 @@
      }
    },
    "defaults_or_custom": {
-     "label": "기본 워커/모델로 진행할까요, 아니면 커스터마이즈할까요?",
+     "label": "역할별로 어떤 모델을 쓸지 정하는 단계입니다 (참여 워커 구성을 바꾸는 게 아닙니다).\n· 기본값으로 진행 — lead·실행자/워커·report-writer 를 모두 추천 모델로 두고 바로 진행합니다.\n· 커스터마이즈 — 역할별 모델을 직접 고르고, 추가 directive·관련 task 도 지정합니다.",
      "echo_template": "customize: {value}",
      "options": {
-       "defaults": "Use defaults",
-       "customize": "Customize"
+       "defaults": "기본값으로 진행 (역할별 추천 모델 그대로)",
+       "customize": "커스터마이즈 (역할별 모델 직접 선택)"
      }
    },
    "workers_override": {
@@ -1514,11 +1514,11 @@ def inject_lead_prompt_computed_tokens(ctx: dict) -> None:
  def apply_lead_prompt_defaults(ctx: dict) -> None:
      """Apply default values for optional lead-prompt ctx fields.
 
-     Sets four optional tokens that the lead prompt template references but
+     Sets the optional tokens that the lead prompt template references but
      which callers may legitimately leave unset (e.g., no validation has run
-     yet, no related tasks were declared). Caller-supplied values are
-     preserved via `setdefault` / `if-not-in` semantics — this function only
-     fills gaps, never overwrites.
+     yet, no related tasks were declared, the run is not an implementation
+     batch). Caller-supplied values are preserved via `setdefault` / `if-not-in`
+     semantics — this function only fills gaps, never overwrites.
 
      Companion to `inject_lead_prompt_computed_tokens` (which always
      overwrites with deterministically-derived values). The two functions
@@ -1528,6 +1528,9 @@ def apply_lead_prompt_defaults(ctx: dict) -> None:
      ctx.setdefault("VALIDATION_STATUS", "not-run")
      ctx.setdefault("RELATED_TASKS_BULLETS", "- None recorded")
      ctx.setdefault("RELATED_TASKS_INLINE", "None")
+     # Empty for non-implementation runs; the implementation prepare path
+     # overwrites it with the resolved stage-batch directive.
+     ctx.setdefault("STAGE_BATCH_DIRECTIVE", "")
      ctx.setdefault(
          "WORKER_PROMPT_PREAMBLE_PATH",
          str(Path.home() / ".okstra" / "templates" / "worker-prompt-preamble.md"),
@@ -208,42 +208,58 @@ def _validate_stage_structure(plan_path: str) -> None:
      )
 
 
- def _resolve_effective_stage(
+ RUN_STEP_BUDGET = 8
+
+
+ def _resolve_effective_stages(
      stages: list,
      done_stages: set,
      requested: str,
- ) -> int:
-     """Return the stage number to execute.
+     budget: int = RUN_STEP_BUDGET,
+ ) -> list:
+     """Return the ordered list of stage numbers this run executes.
+
+     `requested` is "auto" or a decimal string. For "auto" the run batches all
+     ready stages (depends-on all done, itself not done) in stage-number order up
+     to `budget` effective steps — but always at least one. A numeric request is a
+     single forced stage. Raises PrepareError on rejection cases."""
+     if requested != "auto":
+         try:
+             n = int(requested)
+         except ValueError:
+             raise PrepareError(
+                 f"--stage must be 'auto' or an integer, got {requested!r}"
+             )
+         target = next((s for s in stages if s["stage_number"] == n), None)
+         if target is None:
+             raise PrepareError(
+                 f"--stage {n} not in Stage Map "
+                 f"(have {[s['stage_number'] for s in stages]})"
+             )
+         if n in done_stages:
+             raise PrepareError(
+                 f"--stage {n} already completed (consumers.jsonl status:done exists)"
+             )
+         return [n]
 
-     `requested` is either "auto" or a decimal string.
-     Raises PrepareError on all rejection cases.
-     """
-     if requested == "auto":
-         for s in stages:
-             if s["stage_number"] in done_stages:
-                 continue
-             if all(d in done_stages for d in s["depends_on"]):
-                 return s["stage_number"]
+     ready = [
+         s for s in stages
+         if s["stage_number"] not in done_stages
+         and all(d in done_stages for d in s["depends_on"])
+     ]
+     if not ready:
          raise PrepareError(
              "no stage is ready: every remaining stage has unsatisfied depends-on"
          )
-     try:
-         n = int(requested)
-     except ValueError:
-         raise PrepareError(
-             f"--stage must be 'auto' or an integer, got {requested!r}"
-         )
-     target = next((s for s in stages if s["stage_number"] == n), None)
-     if target is None:
-         raise PrepareError(
-             f"--stage {n} not in Stage Map "
-             f"(have {[s['stage_number'] for s in stages]})"
-         )
-     if n in done_stages:
-         raise PrepareError(
-             f"--stage {n} already completed (consumers.jsonl status:done exists)"
-         )
-     return n
+     batch: list = []
+     total = 0
+     for s in ready:
+         sc = s.get("step_count", 0) or 0
+         if batch and total + sc > budget:
+             break
+         batch.append(s["stage_number"])
+         total += sc
+     return batch
 
 
  def _parse_stage_map_into_ctx(plan_path: str) -> list:
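
To make the "auto" batching in `_resolve_effective_stages` concrete, here is a standalone sketch of the same greedy loop run against a made-up Stage Map. The `pick_batch` name and the `STAGES` data are hypothetical; unlike the function in the diff, this sketch returns an empty list rather than raising `PrepareError` when nothing is ready.

```python
from typing import Dict, List, Set

RUN_STEP_BUDGET = 8  # mirrors the constant introduced in the diff


def pick_batch(
    stages: List[Dict], done: Set[int], budget: int = RUN_STEP_BUDGET
) -> List[int]:
    """Greedy batch: take ready stages in listed order until the budget is hit."""
    ready = [
        s for s in stages
        if s["stage_number"] not in done
        and all(d in done for d in s["depends_on"])
    ]
    batch: List[int] = []
    total = 0
    for s in ready:
        steps = s.get("step_count", 0) or 0
        # The first stage is always taken, even if it alone exceeds the budget.
        if batch and total + steps > budget:
            break
        batch.append(s["stage_number"])
        total += steps
    return batch


# Hypothetical Stage Map: three independent stages and one dependent stage.
STAGES = [
    {"stage_number": 1, "depends_on": [], "step_count": 5},
    {"stage_number": 2, "depends_on": [], "step_count": 3},
    {"stage_number": 3, "depends_on": [], "step_count": 2},
    {"stage_number": 4, "depends_on": [1], "step_count": 4},
]

first_run = pick_batch(STAGES, done=set())    # [1, 2]
second_run = pick_batch(STAGES, done={1, 2})  # [3, 4]
```

In the first run, stages 1 and 2 fill the budget exactly (5 + 3 = 8), so stage 3 is left for a later run. The loop stops at the first stage that would overflow instead of skipping ahead to a smaller one, which keeps the batch in stage-number order.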
@@ -842,31 +858,42 @@ def prepare_task_bundle(inp: PrepareInputs) -> PrepareOutputs:
      })
      if inp.task_type == "implementation":
          ctx["parsed_stage_map"] = ctx_stage_map
-         # Resolve effective stage and append `started` row to consumers.jsonl
+         # Resolve the ready-set batch and append a `started` row per batched stage.
          from .consumers import read_consumers, append_consumer
          import datetime as _dt
          plan_run_root = Path(inp.approved_plan_path).resolve().parents[1]
          consumed = read_consumers(plan_run_root)
          done_stages = {r["stage"] for r in consumed if r.get("status") == "done"}
-         effective = _resolve_effective_stage(
+         effective = _resolve_effective_stages(
              ctx["parsed_stage_map"], done_stages, inp.stage
          )
-         ctx["effective_stage"] = effective
-         inp.stage = str(effective)
-         print(f"selected stage: {inp.stage}", file=sys.stdout)
+         ctx["effective_stages"] = effective
+         csv = ",".join(str(n) for n in effective)
+         ctx["EFFECTIVE_STAGES"] = csv
+         ctx["STAGE_BATCH_DIRECTIVE"] = (
+             f"- **Stage batch for this implementation run:** `{csv}` "
+             "(comma-separated stage numbers, ascending). Execute exactly these "
+             "Stage Map stages in this order — this is the authoritative scope. "
+             "Do NOT recompute the start stage from `consumers.jsonl`; the runtime "
+             "already selected and reserved this batch."
+         )
+         inp.stage = csv
+         print(f"selected stages: {csv}", file=sys.stdout)
          head_proc = _subprocess.run(
              ["git", "rev-parse", "HEAD"],
              cwd=inp.project_root, capture_output=True, text=True,
          )
          head_sha = head_proc.stdout.strip() if head_proc.returncode == 0 else ""
-         append_consumer(
-             plan_run_root,
-             impl_task_key=ctx["TASK_KEY"],
-             stage=effective,
-             status="started",
-             started_at=_dt.datetime.now(_dt.timezone.utc).isoformat(),
-             head_commit=head_sha,
-         )
+         now = _dt.datetime.now(_dt.timezone.utc).isoformat()
+         for stage_n in effective:
+             append_consumer(
+                 plan_run_root,
+                 impl_task_key=ctx["TASK_KEY"],
+                 stage=stage_n,
+                 status="started",
+                 started_at=now,
+                 head_commit=head_sha,
+             )
 
      # ---- prepare directories + cleanup ----
      _ensure_task_directories(ctx)
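
The hunk above derives `done_stages` from consumer rows and stores the batch as a comma-separated string. The sketch below replays both steps on in-memory data. The row dicts are an assumed shape based only on the keyword arguments visible in this diff (`stage`, `status`, `carry_path`); the real `consumers.jsonl` schema and the `read_consumers` return type are not shown here, and the `carry_path` value is invented.

```python
from typing import Dict, List, Set

# Hypothetical consumers.jsonl rows, already parsed into dicts.
ROWS: List[Dict] = [
    {"stage": 1, "status": "started"},
    {"stage": 1, "status": "done", "carry_path": "carry/stage-1.json"},
    {"stage": 2, "status": "started"},
]


def done_stages(rows: List[Dict]) -> Set[int]:
    """Only `done` rows count; a `started` row alone does not complete a stage."""
    return {r["stage"] for r in rows if r.get("status") == "done"}


def batch_csv(batch: List[int]) -> str:
    """Comma-separated form, as stored under ctx["EFFECTIVE_STAGES"] in the diff."""
    return ",".join(str(n) for n in batch)


completed = done_stages(ROWS)   # {1}
csv_value = batch_csv([2, 3])   # "2,3"
```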
@@ -1068,23 +1068,22 @@ def _list_implementation_planning_reports(
      """
      if not state.task_group or not state.task_id or not state.project_root:
          return []
-     base = (tasks_root(state.project_root)
-             / slugify_task_segment(state.task_group)
-             / slugify_task_segment(state.task_id)
-             / "runs" / "implementation-planning")
-     if not base.is_dir():
+     # Run seq lives in the filename, not a per-run subdirectory: every
+     # implementation-planning run writes into the same flat `reports/`
+     # dir (see paths.py — `run_reports = runs/<task-type>/reports`).
+     reports_dir = (tasks_root(state.project_root)
+                    / slugify_task_segment(state.task_group)
+                    / slugify_task_segment(state.task_id)
+                    / "runs" / "implementation-planning" / "reports")
+     if not reports_dir.is_dir():
          return []
      pat = re.compile(r"^final-report-implementation-planning-(\d+)\.md$")
      found: list[tuple[int, Path]] = []
-     for run_dir in base.iterdir():
-         reports = run_dir / "reports"
-         if not reports.is_dir():
+     for child in reports_dir.iterdir():
+         m = pat.match(child.name)
+         if not m:
              continue
-         for child in reports.iterdir():
-             m = pat.match(child.name)
-             if not m:
-                 continue
-             found.append((int(m.group(1)), child))
+         found.append((int(m.group(1)), child))
      found.sort(key=lambda x: -x[0])
      out: list[Path] = []
      for _, p in found[:limit]:
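
Since the run sequence now lives in the filename, ordering depends on parsing that number rather than on directory layout. The following sketch applies the same regex and descending numeric sort to a plain list of names; `newest_reports` and the sample names are hypothetical and stand in for the `Path` objects the real function iterates over.

```python
import re
from typing import List

PAT = re.compile(r"^final-report-implementation-planning-(\d+)\.md$")


def newest_reports(names: List[str], limit: int) -> List[str]:
    """Return up to `limit` matching report names, highest run seq first."""
    found = []
    for name in names:
        m = PAT.match(name)
        if not m:
            continue  # ignore anything that is not a final report
        found.append((int(m.group(1)), name))
    # Sort on the integer seq so that 10 ranks above 2.
    found.sort(key=lambda x: -x[0])
    return [name for _, name in found[:limit]]


NAMES = [
    "final-report-implementation-planning-2.md",
    "final-report-implementation-planning-10.md",
    "notes.md",
    "final-report-implementation-planning-1.md",
]
latest_two = newest_reports(NAMES, limit=2)
```

Comparing the captured group as an integer matters: a plain string sort would place `-2.md` after `-10.md`.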
@@ -1,5 +1,5 @@
  #!/usr/bin/env python3
- """S1–S8 checks for the Stage Map structure of an approved
+ """S1–S9 checks for the Stage Map structure of an approved
  implementation-planning final-report.md. Run from prepare_task_bundle
  of `implementation` task or standalone."""
 
@@ -23,6 +23,11 @@ REQUIRED_SUBSECTIONS = (
      "Stage Validation",
  )
 
+ EXIT_CONTRACT_HEADING = re.compile(r"^###\s+Stage Exit Contract\b", re.M)
+ # best-effort path token: only slash-containing paths count as files, so
+ # endpoints (`/bar`), env vars (`BAZ_MODE`), and extensionless tokens are skipped.
+ PATH_TOKEN = re.compile(r"(?:[\w.@-]+/)+[\w.@-]+")
+
 
  @dataclass
  class StageMeta:
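
A quick way to see what the `PATH_TOKEN` pattern accepts is to run it over a sample Stage Exit Contract sentence. The regex below is copied from the hunk above; the sample text and the file names in it are invented for illustration.

```python
import re

PATH_TOKEN = re.compile(r"(?:[\w.@-]+/)+[\w.@-]+")

# Hypothetical exit-contract prose mixing paths with non-path tokens.
SAMPLE = (
    "Adds `src/okstra/consumers.py` and `tests/test_consumers.py`; "
    "exposes endpoint `/bar`, env var `BAZ_MODE`, and a `Makefile` target."
)

# Each match needs at least one `segment/` prefix, so bare names and a
# leading-slash endpoint with a single segment do not qualify.
found = sorted(set(PATH_TOKEN.findall(SAMPLE)))
```

Here `found` holds only the two slash-containing file paths. The heuristic is best-effort, as the comment in the diff says: any token with an interior slash is treated as a file, whatever it actually denotes.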
@@ -35,7 +40,7 @@ class StageMeta:
 
  @dataclass
  class ValidationError:
-     code: str  # S1..S8
+     code: str  # S1..S9
      stage: int  # 0 = global
      message: str
 
@@ -85,6 +90,20 @@ def _parse_stage_map(text: str) -> Tuple[List[StageMeta], List[ValidationError]]
      return rows, errors
 
 
+ def _slice_stage_section(text: str, stage_number: int) -> str:
+     """Return the body of `## 4.5.<n> Stage <n>:` up to the next stage heading."""
+     start_m = re.search(
+         rf"^##\s+4\.5\.{stage_number}\s+Stage\s+{stage_number}\s*:", text, re.M
+     )
+     if not start_m:
+         return ""
+     start = start_m.end()
+     nxt = re.search(
+         rf"^##\s+4\.5\.{stage_number + 1}\s+Stage\s+", text[start:], re.M
+     )
+     return text[start: start + nxt.start()] if nxt else text[start:]
+
+
  def _count_effective_steps(section: str) -> int:
      m = re.search(r"^###\s+Stepwise Execution Order\b", section, re.M)
      if not m:
@@ -114,19 +133,13 @@ def _count_effective_steps(section: str) -> int:
  def _check_each_stage_section(text: str, stages: List[StageMeta]) -> List[ValidationError]:
      errs: List[ValidationError] = []
      for s in stages:
-         pattern = rf"^##\s+4\.5\.{s.stage_number}\s+Stage\s+{s.stage_number}\s*:"
-         start_m = re.search(pattern, text, re.M)
-         if not start_m:
+         if not re.search(
+             rf"^##\s+4\.5\.{s.stage_number}\s+Stage\s+{s.stage_number}\s*:", text, re.M
+         ):
              errs.append(ValidationError("S3", s.stage_number,
                  f"stage section '## 4.5.{s.stage_number} Stage {s.stage_number}:' missing"))
              continue
-         # Slice the stage's section body
-         start = start_m.end()
-         nxt = re.search(
-             rf"^##\s+4\.5\.{s.stage_number + 1}\s+Stage\s+",
-             text[start:], re.M,
-         )
-         section = text[start: start + nxt.start()] if nxt else text[start:]
+         section = _slice_stage_section(text, s.stage_number)
 
          for sub in REQUIRED_SUBSECTIONS:
              if not re.search(rf"^###\s+{re.escape(sub)}\b", section, re.M):
@@ -181,8 +194,42 @@ def _check_depends_on(stages: List[StageMeta]) -> List[ValidationError]:
      return errs
 
 
+ def _extract_exit_contract_files(section: str) -> set:
+     m = EXIT_CONTRACT_HEADING.search(section)
+     if not m:
+         return set()
+     body = section[m.end():]
+     nxt = re.search(r"^###\s+\w", body, re.M)
+     if nxt:
+         body = body[: nxt.start()]
+     return set(PATH_TOKEN.findall(body))
+
+
+ def _check_parallel_safety(text: str, stages: List[StageMeta]) -> List[ValidationError]:
+     """S9: two `depends-on (none)` stages must not predict the same file —
+     otherwise two parallel implementation runs would edit it concurrently."""
+     files = {
+         s.stage_number: _extract_exit_contract_files(
+             _slice_stage_section(text, s.stage_number)
+         )
+         for s in stages
+         if not s.depends_on
+     }
+     errs: List[ValidationError] = []
+     nums = sorted(files)
+     for i in range(len(nums)):
+         for j in range(i + 1, len(nums)):
+             a, b = nums[i], nums[j]
+             shared = files[a] & files[b]
+             if shared:
+                 errs.append(ValidationError("S9", 0,
+                     f"parallel stages {a} and {b} share predicted file(s): "
+                     f"{', '.join(sorted(shared))}"))
+     return errs
+
+
  def collect_validation_errors(text: str) -> List[ValidationError]:
-     """All S1–S8 checks against the report text; empty list means valid.
+     """All S1–S9 checks against the report text; empty list means valid.
 
      S1 (missing `## 4.5 Stage Map` heading) makes the rest unparseable, so it
      short-circuits. Shared by `main()` (CLI / implementation entry) and the
@@ -198,6 +245,7 @@ def collect_validation_errors(text: str) -> List[ValidationError]:
      if stages:
          errors.extend(_check_each_stage_section(text, stages))
          errors.extend(_check_depends_on(stages))
+         errors.extend(_check_parallel_safety(text, stages))
      return errors
 
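
The pairwise comparison at the heart of check S9 can be tried in isolation. This sketch is an equivalent formulation using `itertools.combinations` in place of the nested index loops in the diff; `shared_files` and the predicted file sets are hypothetical, and the message format is simplified.

```python
from itertools import combinations
from typing import Dict, List, Set


def shared_files(predicted: Dict[int, Set[str]]) -> List[str]:
    """Report every pair of parallel stages whose predicted file sets overlap."""
    messages: List[str] = []
    # combinations() over the sorted keys yields each unordered pair once, a < b.
    for a, b in combinations(sorted(predicted), 2):
        shared = predicted[a] & predicted[b]
        if shared:
            messages.append(
                f"stages {a} and {b} share: {', '.join(sorted(shared))}"
            )
    return messages


# Hypothetical predicted file sets for three `depends-on (none)` stages.
PREDICTED = {
    1: {"src/a.py", "src/b.py"},
    2: {"src/b.py"},
    3: {"docs/x.md"},
}
overlaps = shared_files(PREDICTED)
```

With this data only stages 1 and 2 collide, on `src/b.py`. Per the rule text, the fix is either to merge them into one stage or to add a `depends-on` link between them.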