okstra 0.45.1 → 0.46.0
- package/docs/superpowers/plans/2026-06-04-stage-cohesion-planner.md +351 -0
- package/docs/superpowers/plans/2026-06-04-stage-run-batching.md +457 -0
- package/docs/superpowers/specs/2026-05-20-implementation-planning-multi-stage-design.md +2 -0
- package/docs/superpowers/specs/2026-06-04-stage-splitting-cost-aware-design.md +98 -0
- package/package.json +1 -1
- package/runtime/BUILD.json +2 -2
- package/runtime/prompts/launch.template.md +1 -0
- package/runtime/prompts/profiles/_implementation-deliverable.md +4 -3
- package/runtime/prompts/profiles/_implementation-executor.md +10 -12
- package/runtime/prompts/profiles/implementation-planning.md +3 -2
- package/runtime/prompts/profiles/implementation.md +1 -0
- package/runtime/python/okstra_ctl/render.py +7 -4
- package/runtime/python/okstra_ctl/run.py +69 -42
- package/runtime/validators/validate-implementation-plan-stages.py +61 -13
@@ -27,10 +27,9 @@ until Phase 5 ends, then drop from active context for Phase 6/7.
 - Doc-only / config-only / pure-rename steps that have no observable runtime behaviour are exempt from the failing-test requirement, but the executor MUST cite the exemption per step in the final report (`TDD exemption: <reason>`).
 - When the touched area has no existing test harness, the executor MUST stand up the minimum harness needed to host one regression test for this run rather than skipping TDD entirely. Record the harness-bootstrap step as an `Out-of-plan edit` if it is not in the plan.
 - **DB / IO / SQL changes require real execution — mock-only is NOT validation evidence:** when this run's diff touches DB/IO/SQL (ORM / query-builder code — sequelize / typeorm / prisma / knex / raw SQL — `*.repository.*`, model/entity files, `migrations/**`, `*.sql`, or any changed query string), a mocked unit test cannot observe the SQL the query builder actually emits — a mocked suite once passed while `count({ col: 'FontFamily.fontFamily' })` threw `Unknown column` on the real DB. The executor MUST run the change against a real (or faithful-replica) datastore — the `db-test` validation step (plan `validation` db step, else `project.json.qaCommands.db-test`), targeting a **local / replica** DB — and cite its exact command + exit code in the final report's `Validation evidence`. If no real DB / `db-test` command is reachable, do NOT claim the change verified: label the DB portion `정적 분석상 …, 미검증(실행 안 함)` in the report, surface it in the routing recommendation, and never downplay the real run as "too heavy". `git push` stays forbidden (universal list); the unverified DB state is carried forward so `final-verification` cannot accept it and `release-handoff` cannot push.
-- re-read the approved plan end-to-end and parse the `## 4.5 Stage Map`.
--
--
-- extract the **start stage's** file list, step order, Stage Validation commands, Stage Exit Contract, and rollback path. These — not the whole plan — are the authoritative scope for this run.
+- re-read the approved plan end-to-end and parse the `## 4.5 Stage Map`. Read the **Stage batch** injected in the launch prompt (`Stage batch for this implementation run`): it lists the stage numbers this run owns, ascending. The runtime already selected and reserved this batch — do NOT recompute the start stage from `consumers.jsonl`.
+- for each stage in the batch, load every `runs/<plan-key>/carry/stage-<i>.json` for `i ∈ depends-on(stage)` and inject them into the executor's working context as "runtime carry-in". For `depends-on (none)` stages, no sidecar load — task-brief only.
+- the batch's stages are mutually independent (each one's `depends-on` are all already `status:done`, never another batch member), so execute them in ascending order; each stage's file list, step order, Stage Validation commands, Stage Exit Contract, and rollback path are the authoritative scope for that stage.
 - inspect the current state of every file the plan names; if any file has changed materially since the plan was written, stop and route to a new `implementation-planning` run instead of editing speculatively
 - "materially changed" means: the function, class, section, or behaviour the plan targets has been edited, renamed, moved, removed, or otherwise altered in a way that invalidates the plan's reasoning. Cosmetic edits (whitespace, comment-only changes, unrelated function modifications elsewhere in the same file) do NOT trigger a re-plan; cite the diff (`git log --oneline <plan-created-at>..HEAD -- <file>`) in the final report and proceed.
 - distinguish the two file-scope rules (they are not in conflict):
@@ -38,15 +37,14 @@ until Phase 5 ends, then drop from active context for Phase 6/7.
 - **out-of-plan rule** (Allowed actions section below): if a step *requires touching a file NOT in the plan list*, that is permitted with `Out-of-plan edits` justification. This handles honest scope discovery during execution.
 - confirm the test/build commands referenced in the plan still exist and run from a clean state
 
-## Stage execution contract (this run owns
+## Stage execution contract (this run owns the injected stage batch)
 
-- **Sidecar evidence writer (BLOCKING).**
-- **Reverse link (BLOCKING).**
-- **One-PR-per-
-- `## Stage
-- `## Carry-In summary` — depends-on list + cited identifiers/SHAs from each loaded sidecar (omit when depends-on is empty).
-- `## Next
-Stage PRs link back to each other in their bodies (`Previous: #<n>, Next: #<m>` lines) so a reviewer can navigate the chain.
+- **Sidecar evidence writer (BLOCKING, per stage).** For each stage in the batch, when that stage's Stage Validation `post` commands all succeed, the Executor MUST emit a JSON object matching the schema in `docs/superpowers/specs/2026-05-20-implementation-planning-multi-stage-design.md` §3.2 and the lead MUST persist it to `runs/<impl-task-key>/carry/stage-<N>.json`. Each file MUST NOT exist before the run starts (overwrite is refused — see `--force-stage` non-goal).
+- **Reverse link (BLOCKING, per stage).** The runtime already appended a `status:"started"` row per batch stage before this run began. On each stage's completion, append a `status:"done"` row with `carry_path` populated for that stage number.
+- **One-PR-per-run.** This run creates exactly one PR titled `Stages <first>–<last>: <run summary>` (or `Stage <N>: <title>` when the batch is a single stage). The PR body MUST include:
+- `## Stage <N>` — one section per batched stage: number, title (from Stage Map row), touched files, and validation result.
+- `## Carry-In summary` — per stage, depends-on list + cited identifiers/SHAs from each loaded sidecar (omit when depends-on is empty).
+- `## Previous run` / `## Next run` — links so a reviewer can navigate the run chain.
 
 ## Allowed actions during the run
 
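The One-PR-per-run title rule in the hunk above can be sketched as a small formatter. This is only an illustration: `pr_title` is a hypothetical helper, not a function from the okstra runtime, and it assumes the batch is already sorted ascending as the contract states.

```python
def pr_title(batch: list, summary: str) -> str:
    """Hypothetical helper: build the single PR title for a run's stage batch."""
    if not batch:
        # The contract says a run always owns at least one stage.
        raise ValueError("stage batch must not be empty")
    if len(batch) == 1:
        # Single-stage batch: `Stage <N>: <title>`
        return f"Stage {batch[0]}: {summary}"
    # Multi-stage batch: `Stages <first>-<last>: <run summary>` with an en dash.
    return f"Stages {batch[0]}\u2013{batch[-1]}: {summary}"


single = pr_title([2], "add parser")
multi = pr_title([1, 2, 3], "parser and cli")
```

Whether the runtime or the lead agent actually composes the title is not visible in this diff; the sketch only restates the two title shapes.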
@@ -65,7 +65,8 @@
 - `### Stepwise Execution Order` — bite-sized table with `step | action | files | command | expected`. **Effective row count ≤ 6** (excluding header / divider / blank). Each step is one action completable in 2–5 minutes; for code steps include actual code or diff sketch; prefer TDD ordering (failing test → implementation → green → commit).
 - `### Stage Exit Contract` — predicted added/modified files, newly exposed identifiers/types/endpoints, downstream-usable resources.
 - `### Stage Validation` — pre / mid / post exact commands or observable outcomes for this stage only.
-- **
+- **Cohesion-first partition rule (1st-class):** the grouping anchor is **shared file/module proximity** — steps touching the same file/directory/module go in the same stage so the diff, PR, and rollback unit are semantically cohesive. A stage is split ONLY when (a) a real `depends-on` data/contract dependency exists, (b) effective steps would exceed 6, or (c) the file sets are disjoint (unrelated work touching no shared file is not crammed together). Maximising the number of parallel stages is NOT a reason to split — parallelism is an emergent property of independent stages, never a partitioning goal.
+- **Parallel-safety invariant (BLOCKING):** any two stages that are both `depends-on (none)` MUST predict disjoint file sets in their `Stage Exit Contract`. Two parallel `implementation` runs would otherwise edit the same file concurrently. Work touching a shared file must either go in one stage or be ordered with `depends-on`. Enforced by `validators/validate-implementation-plan-stages.py` check S9.
 - **Stage exit contract is the carry surface:** keep it as narrow as possible. Wider surface = more downstream coupling.
 - dependency / migration risk assessment (ordering constraints, data backfills, feature-flag prerequisites, repo-internal sequencing)
 - validation checklist (pre / mid / post) — each item is an exact command or observable outcome
@@ -93,4 +94,4 @@
 4. **Ambiguity check** — any requirement that could be read two ways must be made explicit or moved to the `## 5. Clarification Items` table as a `Blocks=approval` row.
 5. **Scope check** — if the recommended plan now spans multiple independent subsystems, recommend splitting into separate planning runs rather than shipping an oversized plan.
 6. **Plan-body verification reconciliation (BLOCKING for implementation-planning).** Inspect the `### 4.5.9 Plan Body Verification` verdict table. For every plan-item row classified as `majority-disagree → C-<N>`, the corresponding `C-<N>` row MUST exist in `## 5. Clarification Items` with `Kind` chosen per the standard policy and `Blocks=approval`. Do NOT create a parallel `### 4.5.x Open Questions` block — the unified table is the single home. Conversely, the `Classification` column's `C-<N>` reference and the `## 5. Clarification Items` `ID` column MUST match 1:1; an orphan on either side is a contract violation. For `partial-consensus` and `worker-unique` plan-items, the dissenting opinion lives in §4.5.9 `Dissent log` and is NOT promoted to §5.
-7. **Stage Map self-check** — for every stage, count the effective rows of its `Stepwise Execution Order` table by hand; reject the draft if any stage exceeds 6. Walk the `depends-on` graph and confirm it is a DAG (no cycle, no self-reference). For each `depends-on` link,
+7. **Stage Map self-check** — for every stage, count the effective rows of its `Stepwise Execution Order` table by hand; reject the draft if any stage exceeds 6. Walk the `depends-on` graph and confirm it is a DAG (no cycle, no self-reference). For each `depends-on` link, confirm it encodes a real data/contract dependency — do NOT add links to serialise unrelated work, and do NOT split a stage merely to create more parallel stages. **Parallel-safety:** for every pair of `depends-on (none)` stages, confirm their `Stage Exit Contract` predicted file sets are disjoint; if they share a file, merge them or add a `depends-on` link (validator S9 rejects overlap).
@@ -1,6 +1,7 @@
 # Implementation Profile
 
 - Purpose: realise the approved `implementation-planning` deliverable as actual source changes, with cross-model verification, while keeping the run reversible
+- **Run-level fixed cost:** the verifier set, Phase 5.5 convergence, and the Phase 6 report-writer run exactly once per run, over the combined diff of all stages in this run's batch — never once per stage.
 - Required workers:
 - claude
 - codex
@@ -1514,11 +1514,11 @@ def inject_lead_prompt_computed_tokens(ctx: dict) -> None:
 def apply_lead_prompt_defaults(ctx: dict) -> None:
     """Apply default values for optional lead-prompt ctx fields.
 
-    Sets
+    Sets the optional tokens that the lead prompt template references but
     which callers may legitimately leave unset (e.g., no validation has run
-    yet, no related tasks were declared
-    preserved via `setdefault` / `if-not-in`
-    fills gaps, never overwrites.
+    yet, no related tasks were declared, the run is not an implementation
+    batch). Caller-supplied values are preserved via `setdefault` / `if-not-in`
+    semantics — this function only fills gaps, never overwrites.
 
     Companion to `inject_lead_prompt_computed_tokens` (which always
     overwrites with deterministically-derived values). The two functions
@@ -1528,6 +1528,9 @@ def apply_lead_prompt_defaults(ctx: dict) -> None:
     ctx.setdefault("VALIDATION_STATUS", "not-run")
     ctx.setdefault("RELATED_TASKS_BULLETS", "- None recorded")
     ctx.setdefault("RELATED_TASKS_INLINE", "None")
+    # Empty for non-implementation runs; the implementation prepare path
+    # overwrites it with the resolved stage-batch directive.
+    ctx.setdefault("STAGE_BATCH_DIRECTIVE", "")
     ctx.setdefault(
         "WORKER_PROMPT_PREAMBLE_PATH",
         str(Path.home() / ".okstra" / "templates" / "worker-prompt-preamble.md"),
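The gap-filling behaviour the docstring describes comes straight from `dict.setdefault`. A minimal sketch, assuming a plain dict as `ctx` and using a hypothetical `apply_defaults` wrapper in place of the real function:

```python
def apply_defaults(ctx: dict) -> None:
    # Hypothetical stand-in for the one new line in the hunk above:
    # fill the key only when it is absent, never overwrite it.
    ctx.setdefault("STAGE_BATCH_DIRECTIVE", "")


# A non-implementation run never sets the token, so it renders as empty.
plain_ctx = {}
apply_defaults(plain_ctx)

# An implementation run that already carries a directive keeps it.
impl_ctx = {"STAGE_BATCH_DIRECTIVE": "- Stage batch: `1,2`"}
apply_defaults(impl_ctx)
```

This is why the token can be referenced unconditionally from the launch template: it is always present in `ctx`, and empty unless the implementation path filled it.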
@@ -208,42 +208,58 @@ def _validate_stage_structure(plan_path: str) -> None:
         )
 
 
-
+RUN_STEP_BUDGET = 8
+
+
+def _resolve_effective_stages(
     stages: list,
     done_stages: set,
     requested: str,
-
-
+    budget: int = RUN_STEP_BUDGET,
+) -> list:
+    """Return the ordered list of stage numbers this run executes.
+
+    `requested` is "auto" or a decimal string. For "auto" the run batches all
+    ready stages (depends-on all done, itself not done) in stage-number order up
+    to `budget` effective steps — but always at least one. A numeric request is a
+    single forced stage. Raises PrepareError on rejection cases."""
+    if requested != "auto":
+        try:
+            n = int(requested)
+        except ValueError:
+            raise PrepareError(
+                f"--stage must be 'auto' or an integer, got {requested!r}"
+            )
+        target = next((s for s in stages if s["stage_number"] == n), None)
+        if target is None:
+            raise PrepareError(
+                f"--stage {n} not in Stage Map "
+                f"(have {[s['stage_number'] for s in stages]})"
+            )
+        if n in done_stages:
+            raise PrepareError(
+                f"--stage {n} already completed (consumers.jsonl status:done exists)"
+            )
+        return [n]
 
-
-
-
-
-
-
-                continue
-            if all(d in done_stages for d in s["depends_on"]):
-                return s["stage_number"]
+    ready = [
+        s for s in stages
+        if s["stage_number"] not in done_stages
+        and all(d in done_stages for d in s["depends_on"])
+    ]
+    if not ready:
         raise PrepareError(
             "no stage is ready: every remaining stage has unsatisfied depends-on"
         )
-
-
-
-
-
-
-
-
-
-            f"--stage {n} not in Stage Map "
-            f"(have {[s['stage_number'] for s in stages]})"
-        )
-    if n in done_stages:
-        raise PrepareError(
-            f"--stage {n} already completed (consumers.jsonl status:done exists)"
-        )
-    return n
+    batch: list = []
+    total = 0
+    for s in ready:
+        sc = s.get("step_count", 0) or 0
+        if batch and total + sc > budget:
+            break
+        batch.append(s["stage_number"])
+        total += sc
+    return batch
 
 
 def _parse_stage_map_into_ctx(plan_path: str) -> list:
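To see how the `auto` branch behaves at the budget boundary, here is a standalone sketch of the same greedy loop. `pick_batch` and the Stage Map rows are hypothetical; unlike the real function, the sketch returns an empty list instead of raising `PrepareError` when nothing is ready.

```python
RUN_STEP_BUDGET = 8  # same default as in the hunk above


def pick_batch(stages: list, done: set, budget: int = RUN_STEP_BUDGET) -> list:
    """Sketch of the ready-set batching in the `auto` branch."""
    # Ready = not done yet, and every dependency already done.
    ready = [
        s for s in stages
        if s["stage_number"] not in done
        and all(d in done for d in s["depends_on"])
    ]
    batch, total = [], 0
    for s in ready:
        sc = s.get("step_count", 0) or 0
        # Stop at the first stage that would overflow the budget,
        # but only once the batch already holds at least one stage.
        if batch and total + sc > budget:
            break
        batch.append(s["stage_number"])
        total += sc
    return batch


# Hypothetical Stage Map rows, in stage-number order.
stages = [
    {"stage_number": 1, "depends_on": [], "step_count": 5},
    {"stage_number": 2, "depends_on": [], "step_count": 3},
    {"stage_number": 3, "depends_on": [], "step_count": 2},
    {"stage_number": 4, "depends_on": [1], "step_count": 4},
]

first_run = pick_batch(stages, done=set())    # 5 + 3 fits; adding 2 would not
second_run = pick_batch(stages, done={1, 2})  # stage 4 is ready once 1 is done
oversized = pick_batch(
    [{"stage_number": 1, "depends_on": [], "step_count": 10}], done=set()
)
```

Two properties follow from the loop as written: a single stage larger than the budget is still selected on its own, and because the loop uses `break` rather than `continue`, a smaller ready stage after the first overflow is left for a later run.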
@@ -842,31 +858,42 @@ def prepare_task_bundle(inp: PrepareInputs) -> PrepareOutputs:
     })
     if inp.task_type == "implementation":
         ctx["parsed_stage_map"] = ctx_stage_map
-        # Resolve
+        # Resolve the ready-set batch and append a `started` row per batched stage.
         from .consumers import read_consumers, append_consumer
         import datetime as _dt
         plan_run_root = Path(inp.approved_plan_path).resolve().parents[1]
         consumed = read_consumers(plan_run_root)
         done_stages = {r["stage"] for r in consumed if r.get("status") == "done"}
-        effective =
+        effective = _resolve_effective_stages(
             ctx["parsed_stage_map"], done_stages, inp.stage
         )
-        ctx["
-
-
+        ctx["effective_stages"] = effective
+        csv = ",".join(str(n) for n in effective)
+        ctx["EFFECTIVE_STAGES"] = csv
+        ctx["STAGE_BATCH_DIRECTIVE"] = (
+            f"- **Stage batch for this implementation run:** `{csv}` "
+            "(comma-separated stage numbers, ascending). Execute exactly these "
+            "Stage Map stages in this order — this is the authoritative scope. "
+            "Do NOT recompute the start stage from `consumers.jsonl`; the runtime "
+            "already selected and reserved this batch."
+        )
+        inp.stage = csv
+        print(f"selected stages: {csv}", file=sys.stdout)
         head_proc = _subprocess.run(
             ["git", "rev-parse", "HEAD"],
             cwd=inp.project_root, capture_output=True, text=True,
         )
         head_sha = head_proc.stdout.strip() if head_proc.returncode == 0 else ""
-
-
-
-
-
-
-
-
+        now = _dt.datetime.now(_dt.timezone.utc).isoformat()
+        for stage_n in effective:
+            append_consumer(
+                plan_run_root,
+                impl_task_key=ctx["TASK_KEY"],
+                stage=stage_n,
+                status="started",
+                started_at=now,
+                head_commit=head_sha,
+            )
 
     # ---- prepare directories + cleanup ----
     _ensure_task_directories(ctx)
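The hunk derives two values from plain data: the set of finished stages and the comma-separated batch string. A small sketch with hypothetical `consumers.jsonl` rows (only the `stage` and `status` keys used by the hunk are assumed):

```python
# Hypothetical rows as they might come back from read_consumers().
consumed = [
    {"stage": 1, "status": "started"},
    {"stage": 1, "status": "done"},
    {"stage": 2, "status": "started"},  # reserved, but not finished
]

# Same comprehension as in the hunk: only `done` rows count.
done_stages = {r["stage"] for r in consumed if r.get("status") == "done"}

# Same join as in the hunk, applied to a hypothetical resolved batch.
effective = [3, 4]
csv = ",".join(str(n) for n in effective)
```

Note that a stage with only a `started` row is absent from `done_stages`, so as far as this computation goes it is treated like any other unfinished stage.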
@@ -1,5 +1,5 @@
 #!/usr/bin/env python3
-"""S1–
+"""S1–S9 checks for the Stage Map structure of an approved
 implementation-planning final-report.md. Run from prepare_task_bundle
 of `implementation` task or standalone."""
 
@@ -23,6 +23,11 @@ REQUIRED_SUBSECTIONS = (
     "Stage Validation",
 )
 
+EXIT_CONTRACT_HEADING = re.compile(r"^###\s+Stage Exit Contract\b", re.M)
+# best-effort path token: only slash-containing paths count as files, so
+# endpoints (`/bar`), env vars (`BAZ_MODE`), and extensionless tokens are skipped.
+PATH_TOKEN = re.compile(r"(?:[\w.@-]+/)+[\w.@-]+")
+
 
 @dataclass
 class StageMeta:
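A quick way to check what the `PATH_TOKEN` pattern accepts is to run it over a sample Exit Contract line. The sample strings below are made up for illustration; the regex is copied from the hunk.

```python
import re

PATH_TOKEN = re.compile(r"(?:[\w.@-]+/)+[\w.@-]+")

# Hypothetical Exit Contract text mixing a path, an endpoint, an env var,
# and a bare filename.
sample = "adds `pkg/core/run.py`, endpoint `/bar`, env `BAZ_MODE`, and `README`"
found = PATH_TOKEN.findall(sample)

# A multi-segment endpoint still contains "segment/segment", so the
# pattern picks up its tail.
multi_segment = PATH_TOKEN.findall("exposes `/api/bar`")
```

Only the slash-containing path is captured from the first sample, which matches the comment in the hunk. The second sample shows the "best-effort" caveat: an endpoint with more than one segment is matched as `api/bar`, so two stages that mention the same such endpoint could be reported as sharing a file.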
@@ -35,7 +40,7 @@ class StageMeta:
 
 @dataclass
 class ValidationError:
-    code: str # S1..
+    code: str # S1..S9
     stage: int # 0 = global
     message: str
 
@@ -85,6 +90,20 @@ def _parse_stage_map(text: str) -> Tuple[List[StageMeta], List[ValidationError]]
     return rows, errors
 
 
+def _slice_stage_section(text: str, stage_number: int) -> str:
+    """Return the body of `## 4.5.<n> Stage <n>:` up to the next stage heading."""
+    start_m = re.search(
+        rf"^##\s+4\.5\.{stage_number}\s+Stage\s+{stage_number}\s*:", text, re.M
+    )
+    if not start_m:
+        return ""
+    start = start_m.end()
+    nxt = re.search(
+        rf"^##\s+4\.5\.{stage_number + 1}\s+Stage\s+", text[start:], re.M
+    )
+    return text[start: start + nxt.start()] if nxt else text[start:]
+
+
 def _count_effective_steps(section: str) -> int:
     m = re.search(r"^###\s+Stepwise Execution Order\b", section, re.M)
     if not m:
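The extracted helper can be exercised on a tiny report to confirm where each slice starts and ends. The function body below is copied from the hunk (renamed without the leading underscore); the sample report text is hypothetical.

```python
import re


def slice_stage_section(text: str, stage_number: int) -> str:
    """Copy of the helper above: body of one stage up to the next stage heading."""
    start_m = re.search(
        rf"^##\s+4\.5\.{stage_number}\s+Stage\s+{stage_number}\s*:", text, re.M
    )
    if not start_m:
        return ""
    start = start_m.end()
    nxt = re.search(
        rf"^##\s+4\.5\.{stage_number + 1}\s+Stage\s+", text[start:], re.M
    )
    return text[start: start + nxt.start()] if nxt else text[start:]


# Hypothetical two-stage report fragment.
report = (
    "## 4.5.1 Stage 1: parser\n"
    "### Stage Exit Contract\n"
    "- pkg/parser.py\n"
    "## 4.5.2 Stage 2: cli\n"
    "### Stage Exit Contract\n"
    "- pkg/cli.py\n"
)

stage1 = slice_stage_section(report, 1)
stage2 = slice_stage_section(report, 2)
missing = slice_stage_section(report, 3)
```

The slice for stage `n` ends only at a heading numbered `n + 1`; with this logic, a report whose stage numbers skip a value would have the earlier slice run on to the end of the text.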
@@ -114,19 +133,13 @@ def _count_effective_steps(section: str) -> int:
 def _check_each_stage_section(text: str, stages: List[StageMeta]) -> List[ValidationError]:
     errs: List[ValidationError] = []
     for s in stages:
-
-
-
+        if not re.search(
+            rf"^##\s+4\.5\.{s.stage_number}\s+Stage\s+{s.stage_number}\s*:", text, re.M
+        ):
             errs.append(ValidationError("S3", s.stage_number,
                 f"stage section '## 4.5.{s.stage_number} Stage {s.stage_number}:' missing"))
             continue
-
-        start = start_m.end()
-        nxt = re.search(
-            rf"^##\s+4\.5\.{s.stage_number + 1}\s+Stage\s+",
-            text[start:], re.M,
-        )
-        section = text[start: start + nxt.start()] if nxt else text[start:]
+        section = _slice_stage_section(text, s.stage_number)
 
         for sub in REQUIRED_SUBSECTIONS:
             if not re.search(rf"^###\s+{re.escape(sub)}\b", section, re.M):
@@ -181,8 +194,42 @@ def _check_depends_on(stages: List[StageMeta]) -> List[ValidationError]:
     return errs
 
 
+def _extract_exit_contract_files(section: str) -> set:
+    m = EXIT_CONTRACT_HEADING.search(section)
+    if not m:
+        return set()
+    body = section[m.end():]
+    nxt = re.search(r"^###\s+\w", body, re.M)
+    if nxt:
+        body = body[: nxt.start()]
+    return set(PATH_TOKEN.findall(body))
+
+
+def _check_parallel_safety(text: str, stages: List[StageMeta]) -> List[ValidationError]:
+    """S9: two `depends-on (none)` stages must not predict the same file —
+    otherwise two parallel implementation runs would edit it concurrently."""
+    files = {
+        s.stage_number: _extract_exit_contract_files(
+            _slice_stage_section(text, s.stage_number)
+        )
+        for s in stages
+        if not s.depends_on
+    }
+    errs: List[ValidationError] = []
+    nums = sorted(files)
+    for i in range(len(nums)):
+        for j in range(i + 1, len(nums)):
+            a, b = nums[i], nums[j]
+            shared = files[a] & files[b]
+            if shared:
+                errs.append(ValidationError("S9", 0,
+                    f"parallel stages {a} and {b} share predicted file(s): "
+                    f"{', '.join(sorted(shared))}"))
+    return errs
+
+
 def collect_validation_errors(text: str) -> List[ValidationError]:
-    """All S1–
+    """All S1–S9 checks against the report text; empty list means valid.
 
     S1 (missing `## 4.5 Stage Map` heading) makes the rest unparseable, so it
     short-circuits. Shared by `main()` (CLI / implementation entry) and the
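The core of the S9 check is a pairwise set intersection over the root stages. The sketch below restates that step with `itertools.combinations` instead of the nested index loops; `parallel_overlaps` and the file sets are hypothetical, and the input is assumed to be already filtered to `depends-on (none)` stages.

```python
from itertools import combinations


def parallel_overlaps(files_by_stage: dict) -> list:
    """Hypothetical restatement of S9: list (a, b, shared files) per clashing pair."""
    overlaps = []
    # combinations() over the sorted stage numbers visits each unordered
    # pair exactly once, like the i < j loops in the hunk.
    for a, b in combinations(sorted(files_by_stage), 2):
        shared = files_by_stage[a] & files_by_stage[b]
        if shared:
            overlaps.append((a, b, sorted(shared)))
    return overlaps


# Hypothetical predicted file sets for three root stages.
roots = {
    1: {"pkg/parser.py", "pkg/util.py"},
    2: {"pkg/cli.py"},
    3: {"pkg/util.py"},
}
clashes = parallel_overlaps(roots)
```

Here stages 1 and 3 both predict `pkg/util.py`, so the plan would need to merge them or add a `depends-on` link, as the planning profile describes.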
@@ -198,6 +245,7 @@ def collect_validation_errors(text: str) -> List[ValidationError]:
     if stages:
         errors.extend(_check_each_stage_section(text, stages))
         errors.extend(_check_depends_on(stages))
+        errors.extend(_check_parallel_safety(text, stages))
     return errors
 
 