@windyroad/itil 0.47.11 → 0.47.12-preview.598

@@ -497,5 +497,5 @@
497
497
  }
498
498
  },
499
499
  "name": "wr-itil",
500
- "version": "0.47.11"
500
+ "version": "0.47.12"
501
501
  }
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@windyroad/itil",
3
- "version": "0.47.11",
3
+ "version": "0.47.12-preview.598",
4
4
  "description": "ITIL-aligned IT service management for Claude Code (problem, and future incident/change skills)",
5
5
  "bin": {
6
6
  "windyroad-itil": "./bin/install.mjs"
@@ -694,7 +694,7 @@ fi
694
694
 
695
695
  Detection is intentionally **strict** (explicit label or scoped-npm package only) to avoid prompt fatigue (P063 Direction decision). A passing reference to a bare package name (`gh`, `npm`) does NOT trigger the prompt.
696
696
 
697
- **Already-noted check** — before firing the prompt, grep the ticket for the stable marker `- **Upstream report pending** —` (written by option 2 / the AFK fallback below) or `- **Reported Upstream:**` / a `## Reported Upstream` section (written by `/wr-itil:report-upstream` Step 7 back-write per ADR-024 Confirmation criterion 3a). If any of those are already present, skip the prompt — the detection has already fired on a prior run.
697
+ **Already-noted check** — before firing the prompt, grep the ticket for the stable marker `- **Upstream report pending** --` (canonical ASCII form per P210) or the legacy em-dash variant `- **Upstream report pending** —` (written by option 2 / the AFK fallback below; the grep MUST match BOTH variants for backward compatibility) or `- **Reported Upstream:**` / a `## Reported Upstream` section (written by `/wr-itil:report-upstream` Step 7 back-write per ADR-024 Confirmation criterion 3a). If any of those are already present, skip the prompt — the detection has already fired on a prior run.
698
698
 
699
699
  **If the detection fires and nothing has been noted yet**, use `AskUserQuestion`:
700
700
 
@@ -702,15 +702,15 @@ Detection is intentionally **strict** (explicit label or scoped-npm package only
702
702
  - `multiSelect: false`
703
703
  - Options:
704
704
  1. `Invoke /wr-itil:report-upstream now` — halt the transition; the skill runs (it writes the `## Reported Upstream` appendage per ADR-024 Confirmation criterion 3a); the transition resumes afterwards.
705
- 2. `Defer and note in ticket` — append a pending-upstream-report line to the ticket's `## Related` section using the stable marker `- **Upstream report pending** external dependency identified; invoke /wr-itil:report-upstream when ready`. The marker wording is fixed so subsequent runs (and the work-problems `upstream-blocked` skip path) can detect "already noted" without re-firing.
706
- 3. `Not actually upstream` — proceed without invocation; append the same marker with text `- **Upstream report pending** false positive; detection misfire` so the prompt does not re-fire on later reviews.
705
+ 2. `Defer and note in ticket` — append a pending-upstream-report line to the ticket's `## Related` section using the stable marker `- **Upstream report pending** -- external dependency identified; invoke /wr-itil:report-upstream when ready`. The marker wording is fixed (ASCII `--` per P210 — ASCII-only in machine-parseable identifiers; em-dash permitted in pure narrative prose) so subsequent runs (and the work-problems `upstream-blocked` skip path) can detect "already noted" without re-firing.
706
+ 3. `Not actually upstream` — proceed without invocation; append the same marker with text `- **Upstream report pending** -- false positive; detection misfire` so the prompt does not re-fire on later reviews.
707
707
 
708
708
  **Non-interactive (AFK) branch** (per ADR-013 Rule 6 + ADR-024 2026-06-04 (P270) amendment): when `AskUserQuestion` is unavailable, **auto-invoke `/wr-itil:report-upstream`** instead of deferring with the marker. The skill composes the report draft via its own Steps 1–5/4b/5c/6 then scores the drafted prose via the `wr-risk-scorer:external-comms` agent (ADR-028) per the ADR-024 2026-06-04 amendment's orchestrator-side pre-fire gate. Branches:
709
709
 
710
710
  - **Below external-comms appetite** → the skill proceeds (public-issue path Step 5, comment path Step 5c, or security path Step 6 per the existing classification routing); commits the `## Reported Upstream` back-write per Step 7 / Step 8.
711
711
  - **Above appetite** → the skill takes risk-reducing measures (per ADR-042 within-axis precedent generalised to the external-comms risk class — the measures vocabulary is **open-ended LLM judgement** per ADR-024 2026-06-04 second-amendment ratification leaf (a): the `wr-risk-scorer:external-comms` agent's own scoring picks the remedy case-by-case, matching ADR-042's open-vocabulary precedent — NOT a bounded enumeration); re-scores; if within appetite → sends; else → **queues** an `outstanding_questions` entry naming the local ticket ID + queued report path + risk-reduce attempts + residual band + remedy ("review the queued report at `/wr-itil:report-upstream <NNN> <upstream-repo-url>` on return"). The orchestrator continues (P352 queue-and-continue). The `## Queued Upstream Report` section (renamed from `## Drafted Upstream Report` per ADR-024 2026-06-04 second-amendment leaf (c) — same shape; new name reflects the queue-for-review-on-return semantics) carries the report content for the queued question's reference. Security-path routing follows leaf (b) ratification: upstream-with-`SECURITY.md` + below-appetite → file via the declared channel; upstream-without-`SECURITY.md` but with another disclosure channel → external-comms-gated assessment considering impact to (i) our repository, (ii) our reputation, (iii) the party we are reporting to.
712
712
 
713
- The legacy `- **Upstream report pending** —` marker append (the pre-2026-06-04 AFK default) is **superseded** by this auto-invoke branch for all classifications including security. Tickets that already carry the marker from prior sessions are still handled correctly by the work-problems Step 4 classifier — the new path's "already-noted check" matches the legacy marker shape and routes to the report-upstream invocation. The marker shape is retained for backward compatibility on the parking + interactive fallback paths (interactive option 2 still appends it; see options 1/2/3 above).
713
+ The legacy `- **Upstream report pending** --` marker append (canonical ASCII per P210; em-dash variant is the pre-P210 form, still matched for backward compatibility) — the pre-2026-06-04 AFK default is **superseded** by this auto-invoke branch for all classifications including security. Tickets that already carry the marker from prior sessions (either form) are still handled correctly by the work-problems Step 4 classifier — the new path's "already-noted check" matches both variants and routes to the report-upstream invocation. The marker shape is retained for backward compatibility on the parking + interactive fallback paths (interactive option 2 still appends it; see options 1/2/3 above).
714
714
 
715
715
  **Scope**: this detection block fires at two points —
716
716
 
@@ -6,9 +6,12 @@
6
6
  #
7
7
  # Doc-lint structural test (Permitted Exception per ADR-005) — asserts
8
8
  # SKILL.md wording for detection tokens, AskUserQuestion three-option
9
- # prompt, AFK fallback, and the stable `- **Upstream report pending** —`
10
- # marker. Mirrors work-problems-release-cadence.bats and
11
- # report-upstream-contract.bats patterns.
9
+ # prompt, AFK fallback, and the stable `- **Upstream report pending** --`
10
+ # marker (canonical ASCII form per P210; the legacy em-dash variant is
11
+ # still matched by the SKILL's already-noted check for backward
12
+ # compatibility, but is not the canonical-write target). Mirrors
13
+ # work-problems-release-cadence.bats and report-upstream-contract.bats
14
+ # patterns.
12
15
 
13
16
  setup() {
14
17
  REPO_ROOT="$(cd "$(dirname "$BATS_TEST_FILENAME")/../../../../.." && pwd)"
@@ -40,8 +43,17 @@ setup() {
40
43
  [ "$status" -eq 0 ]
41
44
  }
42
45
 
43
- @test "manage-problem: SKILL.md defines the stable Upstream report pending marker with fixed wording" {
44
- run grep -F -- '- **Upstream report pending** external dependency identified; invoke /wr-itil:report-upstream when ready' "$MP_SKILL"
46
+ @test "manage-problem: SKILL.md defines the stable Upstream report pending marker with fixed wording (canonical ASCII per P210)" {
47
+ run grep -F -- '- **Upstream report pending** -- external dependency identified; invoke /wr-itil:report-upstream when ready' "$MP_SKILL"
48
+ [ "$status" -eq 0 ]
49
+ }
50
+
51
+ @test "manage-problem: SKILL.md still references the legacy em-dash marker variant for backward compatibility (P210)" {
52
+ # P210: canonical write form is ASCII `--`, but the already-noted
53
+ # check MUST still match the legacy em-dash variant so tickets
54
+ # written in prior sessions are detected correctly. Asserts the
55
+ # legacy form remains documented in the SKILL prose.
56
+ run grep -F -- '- **Upstream report pending** —' "$MP_SKILL"
45
57
  [ "$status" -eq 0 ]
46
58
  }
47
59
 
@@ -101,7 +113,7 @@ setup() {
101
113
  [ "$status" -eq 0 ]
102
114
  }
103
115
 
104
- @test "work-problems: uses the same stable marker wording as manage-problem" {
105
- run grep -F -- '- **Upstream report pending** external dependency identified; invoke /wr-itil:report-upstream when ready' "$WP_SKILL"
116
+ @test "work-problems: uses the same stable marker wording as manage-problem (canonical ASCII per P210)" {
117
+ run grep -F -- '- **Upstream report pending** -- external dependency identified; invoke /wr-itil:report-upstream when ready' "$WP_SKILL"
106
118
  [ "$status" -eq 0 ]
107
119
  }
@@ -112,21 +112,23 @@ fi
112
112
 
113
113
  Detection is intentionally **strict** (explicit label or scoped-npm package only) to avoid prompt fatigue (P063 Direction decision). A passing reference to a bare package name (`gh`, `npm`) does NOT trigger the prompt.
114
114
 
115
- **Already-noted check** — before firing the prompt, grep the ticket for the stable marker `- **Upstream report pending** —` (written by option 2 / the AFK fallback below) or `- **Reported Upstream:**` / a `## Reported Upstream` section (written by `/wr-itil:report-upstream` Step 7 back-write per ADR-024 Confirmation criterion 3a). If any of those are already present, skip the prompt — the detection has already fired on a prior run.
115
+ **Already-noted check** — before firing the prompt, grep the ticket for the stable marker `- **Upstream report pending** --` (canonical ASCII form per P210) or the legacy em-dash variant `- **Upstream report pending** —` (written by option 2 / the AFK fallback below; the grep MUST match BOTH variants for backward compatibility) or `- **Reported Upstream:**` / a `## Reported Upstream` section (written by `/wr-itil:report-upstream` Step 7 back-write per ADR-024 Confirmation criterion 3a). If any of those are already present, skip the prompt — the detection has already fired on a prior run.
116
116
 
117
117
  **If the detection fires and nothing has been noted yet** (per ADR-044 framework-resolution boundary): the agent applies the AFK fallback default WITHOUT firing `AskUserQuestion`. Per ADR-044, this decision IS framework-resolved — the safe action is "defer and note marker", and the user can correct via authentic-correction (ADR-044 category 6) if a manual `/wr-itil:report-upstream` invocation is wanted instead. Per-transition `AskUserQuestion` for upstream-detection is sub-contracting framework-resolved decisions back to the user (lazy deferral per Step 2d Ask Hygiene Pass classification).
118
118
 
119
119
  **Default behaviour (silent agent action, per ADR-044)**: append the pending-upstream-report line to the ticket's `## Related` section using the stable marker:
120
120
 
121
121
  ```
122
- - **Upstream report pending** external dependency identified; invoke /wr-itil:report-upstream when ready
122
+ - **Upstream report pending** -- external dependency identified; invoke /wr-itil:report-upstream when ready
123
123
  ```
124
124
 
125
+ ASCII `--` per P210 — ASCII-only in machine-parseable identifiers; em-dash permitted in pure narrative prose. The legacy em-dash variant is matched by the already-noted check for backward compatibility.
126
+
125
127
  The marker wording is fixed so subsequent runs (and the work-problems `upstream-blocked` skip path) can detect "already noted" without re-firing. The transition proceeds normally after the marker is appended.
126
128
 
127
129
  **Recovery / override paths** (user-initiated, not asked-per-transition):
128
130
 
129
- - If the detection misfired (false positive — not actually upstream), user appends `- **Upstream report pending** false positive; detection misfire` directly to the ticket's `## Related` section. The next detection-pass observes the marker and skips firing again.
131
+ - If the detection misfired (false positive — not actually upstream), user appends `- **Upstream report pending** -- false positive; detection misfire` directly to the ticket's `## Related` section (ASCII `--` per P210; legacy em-dash variant remains matched for backward compatibility). The next detection-pass observes the marker and skips firing again.
130
132
  - If the user wants to invoke `/wr-itil:report-upstream` immediately rather than deferring, they invoke it directly (`/wr-itil:report-upstream <NNN> <upstream-repo-url>`). The skill writes the `## Reported Upstream` appendage per ADR-024.
131
133
 
132
134
  **AFK and interactive modes use identical behaviour** — the silent-default-with-recovery-path shape is the framework-resolution boundary application; there's no `AskUserQuestion`-vs-fallback differentiation.
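
The silent default described in this hunk (append the canonical marker to `## Related` unless a marker is already present) could be sketched roughly as below. This is a minimal illustration, not the package's implementation: the function name `note_upstream_pending` is hypothetical, and creating a missing `## Related` section at the end of the file is an assumption the SKILL text does not state.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: append the canonical ASCII marker (per P210) to a
# ticket's "## Related" section, skipping tickets that already carry either
# marker variant. Not taken from the package source.

MARKER='- **Upstream report pending** -- external dependency identified; invoke /wr-itil:report-upstream when ready'

note_upstream_pending() {
  local ticket="$1" tmp

  # Already-noted: match both the canonical "--" form and the legacy em-dash form.
  if grep -qF -e '- **Upstream report pending** --' \
              -e '- **Upstream report pending** —' "$ticket"; then
    return 0
  fi

  # Assumption: when the section is missing, create it at the end of the file.
  grep -qE '^## Related[[:space:]]*$' "$ticket" || printf '\n## Related\n\n' >> "$ticket"

  # Write the marker at the end of the "## Related" section: either just
  # before the next "## " heading, or at end of file.
  tmp="$(mktemp)"
  awk -v marker="$MARKER" '
    in_related && /^## / { print marker; print ""; in_related = 0 }
    /^## Related[[:space:]]*$/ { in_related = 1 }
    { print }
    END { if (in_related) print marker }
  ' "$ticket" > "$tmp"
  cat "$tmp" > "$ticket"   # overwrite in place so file permissions are kept
  rm -f "$tmp"
}
```

Calling `note_upstream_pending path/to/ticket.md` twice leaves a single marker line, which is the property the fixed wording is meant to provide.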
@@ -120,12 +120,12 @@ if grep -iE '\b(upstream|third-party|external|vendor)\b|@[[:alnum:]_-]+/[[:alnum
120
120
  fi
121
121
  ```
122
122
 
123
- **Already-noted check** — before firing, grep for `- **Upstream report pending** —` or `- **Reported Upstream:**` or a `## Reported Upstream` section. If present, skip the prompt for this pair.
123
+ **Already-noted check** — before firing, grep for `- **Upstream report pending** --` (canonical ASCII per P210) or the legacy em-dash variant `- **Upstream report pending** —` (the grep MUST match BOTH variants for backward compatibility) or `- **Reported Upstream:**` or a `## Reported Upstream` section. If present, skip the prompt for this pair.
124
124
 
125
125
  **Branch on interactivity (per ADR-013 Rule 1 / Rule 6):**
126
126
 
127
127
  - **Interactive** (`AskUserQuestion` available): use the same three-option prompt the singular's Step 5 documents (invoke /wr-itil:report-upstream / defer-and-note / not-actually-upstream).
128
- - **AFK / non-interactive** (orchestrator markers — "AFK", "work-problems", "batch-work", "ALL_DONE" — present in the invoking context): default to defer-and-note. Append `- **Upstream report pending** external dependency identified; invoke /wr-itil:report-upstream when ready` to the ticket's `## Related` section. Do NOT auto-invoke `/wr-itil:report-upstream` (its Step 6 security branch is interactive — per ADR-024).
128
+ - **AFK / non-interactive** (orchestrator markers — "AFK", "work-problems", "batch-work", "ALL_DONE" — present in the invoking context): default to defer-and-note. Append `- **Upstream report pending** -- external dependency identified; invoke /wr-itil:report-upstream when ready` (canonical ASCII per P210) to the ticket's `## Related` section. Do NOT auto-invoke `/wr-itil:report-upstream` (its Step 6 security branch is interactive — per ADR-024).
129
129
 
130
130
  The detection is per-pair; each Open → Known Error pair runs its own check independently.
131
131
 
@@ -385,7 +385,7 @@ Read the problem file and apply these deterministic rules:
385
385
  | Problem previously attempted twice without progress in this session | **Skip** — mark as stuck, needs interactive attention | user-answerable (direction) |
386
386
  | Open problem with outstanding user-answerable design question (naming, direction, pacing, scope) | **Skip** — surface the question at stop (Step 2.5) | user-answerable (design) |
387
387
  | Open problem needing architect design judgment (new-ADR-level question) | **Skip** — note the architect-design blocker; Step 2.5 may elevate via a pre-triggered architect call in `--deep-stop` mode | architect-design |
388
- | Open problem blocked on upstream dependency or Claude Code capability gap | **Auto-invoke `/wr-itil:report-upstream` via the AFK fallback** (per ADR-024 2026-06-04 (P270) amendment — manage-problem Step 6 external-root-cause detection AFK fallback owns the actual invocation; this row routes through it). The report-upstream skill composes the draft then scores the prose via `wr-risk-scorer:external-comms` (ADR-028); below-appetite → sends; above-appetite → risk-reduces (open-ended LLM judgement per ADR-024 2026-06-04 second-amendment leaf (a)) then re-scores → sends-or-queues. Security routing per leaf (b): upstream-with-`SECURITY.md` + below-appetite → files via declared channel; upstream-without-`SECURITY.md` → external-comms-gated impact assessment to (i) our repo, (ii) our reputation, (iii) reported party. Queued reports save to `## Queued Upstream Report` (renamed from `## Drafted Upstream Report` per leaf (c)). Queue does NOT halt — outstanding_question surfaces at Step 2.4 / Step 2.5b end-of-loop per P352. Iter still classifies the ticket as `upstream-blocked` (the local ticket itself is still blocked on the upstream fix) and **skips work on it** after the report-upstream invocation completes — the report-upstream call is the action this row takes; classification stays `upstream-blocked` so Step 4 routes to skip-rather-than-work. Tickets already carrying `- **Upstream report pending** —` from prior sessions are detected via the already-noted check and routed to the report-upstream invocation (the marker shape is retained as the detection substrate per the 2026-06-04 amendment). | upstream-blocked |
388
+ | Open problem blocked on upstream dependency or Claude Code capability gap | **Auto-invoke `/wr-itil:report-upstream` via the AFK fallback** (per ADR-024 2026-06-04 (P270) amendment — manage-problem Step 6 external-root-cause detection AFK fallback owns the actual invocation; this row routes through it). The report-upstream skill composes the draft then scores the prose via `wr-risk-scorer:external-comms` (ADR-028); below-appetite → sends; above-appetite → risk-reduces (open-ended LLM judgement per ADR-024 2026-06-04 second-amendment leaf (a)) then re-scores → sends-or-queues. Security routing per leaf (b): upstream-with-`SECURITY.md` + below-appetite → files via declared channel; upstream-without-`SECURITY.md` → external-comms-gated impact assessment to (i) our repo, (ii) our reputation, (iii) reported party. Queued reports save to `## Queued Upstream Report` (renamed from `## Drafted Upstream Report` per leaf (c)). Queue does NOT halt — outstanding_question surfaces at Step 2.4 / Step 2.5b end-of-loop per P352. Iter still classifies the ticket as `upstream-blocked` (the local ticket itself is still blocked on the upstream fix) and **skips work on it** after the report-upstream invocation completes — the report-upstream call is the action this row takes; classification stays `upstream-blocked` so Step 4 routes to skip-rather-than-work. Tickets already carrying `- **Upstream report pending** --` (or the legacy em-dash variant) from prior sessions are detected via the already-noted check and routed to the report-upstream invocation (the marker shape is retained as the detection substrate per the 2026-06-04 amendment; ASCII `--` is the canonical form per P210, em-dash is the legacy form, both matched). | upstream-blocked |
389
389
 
390
390
  The default is to work the problem. Only skip when the rule explicitly says so. This is an AFK loop — forward progress matters more than avoiding dead ends, because dead ends are cheap (findings are saved) and interactive input is expensive (user is absent).
391
391
 
@@ -393,7 +393,7 @@ The default is to work the problem. Only skip when the rule explicitly says so.
393
393
 
394
394
  - **user-answerable** — the user can answer directly (verification, naming, direction, pacing, scope). Step 2.5 surfaces these as questions (interactive) or in the Outstanding Design Questions table (non-interactive / AFK).
395
395
  - **architect-design** — requires architect judgment first; may escalate to a new ADR. Step 2.5 can optionally pre-trigger the architect agent in `--deep-stop` mode to produce a concrete user-answerable question. Otherwise noted as "pending architect review".
396
- - **upstream-blocked** — external dependency, Claude Code capability gap, or waiting on third-party fix. Truly terminal for this loop — no user question would change anything. Report the blocker (now via auto-invoke of `/wr-itil:report-upstream`, per ADR-024 2026-06-04 (P270) amendment) and move on. **Before skipping, run the manage-problem external-root-cause detection AFK fallback** (per P063 amended 2026-06-04): the fallback now invokes `/wr-itil:report-upstream` rather than only appending the marker. The report-upstream skill scores the drafted prose via `wr-risk-scorer:external-comms` (ADR-028); below-appetite branches send (public-issue Step 5 / comment Step 5c / security Step 6 per classification); above-appetite branches risk-reduce + re-score; if-still-above queue an `outstanding_questions` entry per P352 queue-and-continue (orchestrator does NOT halt). Existing tickets carrying `- **Upstream report pending** —` or `- **Reported Upstream:**` / a `## Reported Upstream` section are detected via the already-noted check; the marker shape is retained for backward compatibility and as the detection substrate. The outbound audit trail across AFK iterations now reflects ACTUAL filings (or queued-for-review drafts), not just deferred intents.
396
+ - **upstream-blocked** — external dependency, Claude Code capability gap, or waiting on third-party fix. Truly terminal for this loop — no user question would change anything. Report the blocker (now via auto-invoke of `/wr-itil:report-upstream`, per ADR-024 2026-06-04 (P270) amendment) and move on. **Before skipping, run the manage-problem external-root-cause detection AFK fallback** (per P063 amended 2026-06-04): the fallback now invokes `/wr-itil:report-upstream` rather than only appending the marker. The report-upstream skill scores the drafted prose via `wr-risk-scorer:external-comms` (ADR-028); below-appetite branches send (public-issue Step 5 / comment Step 5c / security Step 6 per classification); above-appetite branches risk-reduce + re-score; if-still-above queue an `outstanding_questions` entry per P352 queue-and-continue (orchestrator does NOT halt). Existing tickets carrying `- **Upstream report pending** --` (canonical ASCII per P210), `- **Upstream report pending** —` (legacy em-dash), or `- **Reported Upstream:**` / a `## Reported Upstream` section are detected via the already-noted check; the marker shape is retained for backward compatibility and as the detection substrate. The outbound audit trail across AFK iterations now reflects ACTUAL filings (or queued-for-review drafts), not just deferred intents.
397
397
 
398
398
  Record the category alongside the skip reason in the iteration report so Step 2.5 can read the categories deterministically.
399
399
 
@@ -509,10 +509,22 @@ rm -f "$ITER_JSON"
509
509
 
510
510
  **Iteration prompt body (self-contained — the subprocess has no prior conversation context):**
511
511
 
512
+ **Re-ground per iter (P211 — orchestrator-side construction invariant)**: each iter's prompt body MUST be re-grounded per iter against the CURRENT ticket's identity (ID + title) only. The orchestrator does NOT inline the target ticket's `## Fix Strategy` section verbatim into the dispatch prompt — the subprocess reads Fix Strategy from disk via `/wr-itil:manage-problem` inside its own context, where the design rationale travels with the ticket file and stays anchored to the correct ticket. Across iterations, no prior-iter content leaks into iter N's prompt body — specifically, prior ticket ID, prior Fix Strategy text, prior outcome reason, prior commit SHA, prior retro findings, and prior outstanding-question entries MUST NOT carry across the iter boundary into the new prompt. The construction is template-driven and reset per iter; no global accumulator carries from iter to iter. The "self-contained" opener above is a subprocess-side property (the subprocess has no prior conversation context); the re-grounding invariant is the symmetric orchestrator-side property (the orchestrator main turn does not carry prior-iter prompt content into the next iter's dispatch construction). P211 reported as inbound from downstream consumer bbstats as their P194 — without this invariant, an iter inherits a stale design-rationale frame and may land fixes anchored on the wrong ticket's intent, degrading the JTBD-006 audit trail. **`@jtbd JTBD-006`** (load-bearing).
513
+
512
514
  1. **Context**: this is one iteration of the AFK work-problems loop. The user is AFK. The orchestrator selected `P<NNN> (<title>)` as the highest-WSJF actionable ticket.
513
515
  2. **Task**: apply the `/wr-itil:manage-problem` workflow for `work highest WSJF problem that can be progressed non-interactively as the user is AFK`. Follow manage-problem SKILL.md verbatim, including architect / jtbd / style-guide / voice-tone gate reviews and the commit gate (manage-problem Step 11). Because this subprocess has the Agent tool in its own surface, the normal review-via-subagent paths work — no inline-verdict fallback needed.
514
516
  3. **Constraints**: commit the completed work per ADR-014. Do NOT push, do NOT run `push:watch`, do NOT run `release:watch` — the orchestrator's Step 6.5 owns release cadence. Do NOT invoke `capture-*` background skills mid-iter (AFK carve-out — ADR-032), **EXCEPT for retro-surfaced observations of recurring class-of-behaviour** — those route to `/wr-itil:capture-problem` per the **P342 mechanical-stage carve-out** (see retro-on-exit constraint #4 below; same trust-boundary as `/wr-retrospective:run-retro` Step 4a verification close-on-evidence — P342). Do NOT use `ScheduleWakeup` under any circumstance (P083 — iteration workers must not self-reschedule). **NEVER call `AskUserQuestion` mid-loop in AFK** (P135 / ADR-044): direction / deviation-approval / one-time-override / silent-framework observations queue at `ITERATION_SUMMARY.outstanding_questions` for loop-end batched presentation. **This includes the manage-problem substance-confirm-before-build guard (ADR-074 (Confirm a decision's substance before building dependent work)):** when the propose-fix step detects that the fix builds on a born-`proposed` decision whose substance is unconfirmed (via `wr-architect-is-decision-unconfirmed`), the iter does NOT implement on it and does NOT ask mid-loop — it queues a `category: "direction"` entry naming the unconfirmed ADR + its Decision Outcome for loop-end confirmation, and routes the ticket to `action: skipped`, `skip_reason_category: user-answerable`. Building on the unconfirmed substance instead (or guessing the choice) is the P315 failure this guard exists to prevent. The queued substance-confirm is a legitimate cat-1 direction ask — it is NOT counted as lazy in the Step 2d Ask Hygiene Pass (ADR-074 lazy-count exclusion). Per-iter `AskUserQuestion` calls are sub-contracting framework-resolved decisions back to the user (lazy deferral per Step 2d Ask Hygiene Pass classification). 
Non-interactive defaults apply per ADR-013 Rule 6 + ADR-044's framework-resolution boundary. **Treat the user as transient** (P130): even when observably present at orchestrator dispatch time, the user may answer one question and disappear for hours; presence is not a reliable signal and is not the goal. The iter's job is to progress the ticket and accumulate questions for batched surfacing — not to ask "is it OK to proceed?" at a mechanical-stage boundary. **Do NOT poll `bats` output with a bats-console-summary regex against TAP-format output** (P146 — bash until-loop-deadlock antipattern). The bats-console-summary line `<N> tests, <M> failures` is emitted ONLY by bats's *default* (non-TAP) formatter; `bats --tap` does not emit a console summary, so a polling loop of shape `until [ -f $OUT ] && grep -qE '^[0-9]+ tests?,' $OUT; do sleep 5; done` spins forever after bats completes (silent deadlock — no error, no exit; recovery requires manual SIGTERM with metadata loss per the P146/P147 stuck-before-emit subclass). When you need to wait on a backgrounded bats run, prefer `wait $bg_pid` (Unix idiom — completion signaled by process exit, no regex required) or, for the Bash tool, `run_in_background=true` + `BashOutput` polling on the tool's exit-state field rather than regex-poll on stdout. If you genuinely must regex-poll TAP output, anchor on the TAP plan line `^[0-9]+\.\.[0-9]+` (e.g. `1..1455`) — TAP's plan line is emitted on completion and is format-stable across bats versions; the bats-console-summary line is not. The console-summary vs TAP-format divergence is the load-bearing detail: `bats` and `bats --tap` produce structurally different stdout, and the antipattern assumes the former when iter dispatch typically uses the latter. **Do NOT poll subprocess completion with `pgrep -f '<pattern>'` inside an `until` / `while` loop** (P232 — self-referential pgrep deadlock; sibling variant of P146). 
`pgrep -f` matches against the FULL command line of every running process, so the polling loop's own `zsh -c` argument (which contains the literal `pgrep -f '<pattern>'` text) matches itself; with multiple concurrent polling loops, each loop matches the others and spins forever. Worked example of the antipattern: `until ! pgrep -f 'bats --recursive' > /dev/null 2>&1; do sleep 5; done` — the 2026-05-16 P232 deadlock witness; 4 concurrent polling loops each matched the others' command lines while no actual bats process ran; 45 min wall-clock + $20-30 wasted before manual SIGTERM. The same self-reference shape applies to `while pgrep -f ...; do sleep; done` and to `until ! pkill -0 -f '<pattern>'` / `while pkill -0 -f '<pattern>'` (signal-0 polling). The structural fix is the same as P146: prefer `wait $bg_pid` (Unix idiom — shell-native completion signal, no regex / no pgrep) or Bash-tool `run_in_background=true` + `BashOutput` polling (harness-tracked completion state). The hook `packages/itil/hooks/itil-bash-polling-antipattern-detect.sh` denies these shapes at PreToolUse:Bash, but the prompt rule belongs here too — structural enforcement + prompt discipline together close the class. **If the fix changes shippable code or package behaviour** (any path under `packages/<plugin>/{src,bin,hooks,skills,scripts,lib,agents}` excluding test paths — `test/`, `hooks/test/`, `scripts/test/` — and excluding `README.md` + `docs/*.md`), **the iter MUST author a `.changeset/*.md` entry in the same single ADR-014-grain commit as the fix** (the changeset names the bumping plugin via the YAML frontmatter `"@windyroad/<plugin>": <patch|minor|major>` per the changesets-action contract). **Doc-only changes** (under `docs/`, `*.md`) **and test-only changes** (under any `test/` path) **that ship no behaviour MAY omit the changeset**. 
The orchestrator's Step 6.5 release-cadence drain runs `release:watch` only when `.changeset/` is non-empty after push — without an iter-authored changeset, code-shape fixes accumulate without ever shipping to npm (violating JTBD-006's audit-trail expectation + JTBD-007's "Keep Plugins Current" closure dependency). Hook `packages/itil/hooks/itil-changeset-discipline.sh` (P141) provides hook-level enforcement at `git commit` time as defence-in-depth — but plugin hook execution depends on the marketplace cache carrying the current hook version, so the prompt-time constraint here MUST land independently (composes-with the hook; does NOT rely on the hook being installed). Inbound-reported from downstream consumer bbstats as their P195 — see [Related](#related) for `**Origin**: inbound-reported (bbstats#195)` per ADR-076. **`@jtbd JTBD-006`** (load-bearing) **`@jtbd JTBD-007`** (closure-dependent).
515
- 4. **Retro-on-exit (P086) + retro-surfaced observation classification (P342)**: before emitting `ITERATION_SUMMARY`, invoke `/wr-retrospective:run-retro`. Retro runs INSIDE this subprocess so its Step 2b pipeline-instability scan has access to the iteration's rich tool-call history (hook misbehaviour, repeat-workaround patterns, subagent-delegation friction, release-path instability). Retro may create tickets or update `docs/BRIEFING.md`; run-retro commits its own work per ADR-014, and any tickets it creates ride into either the iteration's own commit (if retro runs before the main commit) or a retro-owned follow-up commit, and the orchestrator picks them up on the next Step 1 scan. Proceed to `ITERATION_SUMMARY` emission regardless of retro findings — retro is non-blocking at the iter-subprocess layer (do not block on retro): if retro fails or surfaces findings, the iteration still returns a summary so the AFK loop does not silently halt on a flaky retro run. (Session-level retro at the orchestrator-main-turn layer per Step 2.4 gate (b) IS load-bearing — distinct surface; see Step 2.4 prose for the orchestrator-layer halt semantics.)
517
+ 4. **Retro-on-exit (P086) + retro-surfaced observation classification (P342) + iter-owned BRIEFING commit (P212)**: before emitting `ITERATION_SUMMARY`, invoke `/wr-retrospective:run-retro`. Retro runs INSIDE this subprocess so its Step 2b pipeline-instability scan has access to the iteration's rich tool-call history (hook misbehaviour, repeat-workaround patterns, subagent-delegation friction, release-path instability). Tickets retro creates ride a separate path: they delegate through `/wr-itil:manage-problem` which IS ADR-014 in-scope and self-commits each ticket per its own Step 11. Those commits land independently and the orchestrator picks them up on the next Step 1 scan.
518
+
519
+ **BRIEFING.md commit responsibility — iter owns, run-retro does not (P212).** run-retro is explicitly out-of-scope for self-commit per ADR-014's Scope section (which lists `packages/retrospective/skills/run-retro/SKILL.md` under "Out of scope for now"). Retro therefore EDITS but DOES NOT COMMIT `docs/BRIEFING.md` / `docs/briefing/*.md`. The iter subprocess (NOT run-retro, NOT the orchestrator main turn) owns the BRIEFING commit. After retro completes, run `git status --porcelain docs/BRIEFING.md docs/briefing/`. If non-empty, the iter:
520
+
521
+ 1. Stages the dirty BRIEFING paths (`git add docs/BRIEFING.md docs/briefing/`).
522
+ 2. Delegates to `wr-risk-scorer:pipeline` per ADR-014's `work → score → commit` ordering. The BRIEFING refresh is mechanical chore-class (derived retro output, no source-of-truth change) — within-appetite by construction, same risk shape as the `chore(problems): reconcile README ...` and `chore(problems): check upstream responses` precedents in ADR-014's commit-message convention table.
523
+ 3. Commits as `chore(briefing): refresh from iter retro (P<NNN>)` where `P<NNN>` is the ticket the iter was working.
524
+
525
+ Pre-P212, the orchestrator's Step 6.75 absorbed this as `dirty-for-a-known-reason` and added the commit at orchestrator-main-turn cost, invoking `wr-risk-scorer:pipeline` twice per iter (once for the ticket commit, once for the orchestrator-side hand-off). Shifting the commit into the iter subprocess preserves the audit trail (the same `chore(briefing)` commit lands), eliminates the orchestrator-main-turn hand-off, and moves the second scoring call from expensive main-turn context to cheaper iter-subprocess context. Step 6.75's table is amended below to classify dirty BRIEFING-at-iter-exit as a bug class rather than an expected hand-off.
526
+
527
+ Proceed to `ITERATION_SUMMARY` emission regardless of retro findings — retro is non-blocking at the iter-subprocess layer (do not block on retro): if retro fails or surfaces findings, the iteration still returns a summary so the AFK loop does not silently halt on a flaky retro run. The iter MUST verify `git status` is clean (no remaining BRIEFING dirty state) before emitting `ITERATION_SUMMARY`. (Session-level retro at the orchestrator-main-turn layer per Step 2.4 gate (b) IS load-bearing — distinct surface; see Step 2.4 prose for the orchestrator-layer halt semantics.)
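
The iter-owned BRIEFING commit described above could look roughly like the sketch below. It assumes both `docs/BRIEFING.md` and `docs/briefing/` exist in the working tree. `commit_briefing_if_dirty` and `score_commit` are hypothetical names: `score_commit` only marks where the `wr-risk-scorer:pipeline` delegation would sit, since that delegation is an agent call rather than a shell command.

```shell
#!/usr/bin/env bash
# Hypothetical placeholder for the wr-risk-scorer:pipeline delegation
# (work -> score -> commit). Always passes in this sketch.
score_commit() {
  return 0
}

# commit_briefing_if_dirty TICKET_ID
# Commits retro's BRIEFING edits when, and only when, they are dirty.
commit_briefing_if_dirty() {
  local ticket_id=$1
  local dirty

  dirty=$(git status --porcelain docs/BRIEFING.md docs/briefing/) || return 1
  if [ -z "$dirty" ]; then
    return 0   # retro left BRIEFING untouched; nothing to commit
  fi

  git add docs/BRIEFING.md docs/briefing/ || return 1
  score_commit || return 1
  git commit -q -m "chore(briefing): refresh from iter retro (${ticket_id})"
}
```

A caller would run `commit_briefing_if_dirty P212` after retro returns and then re-check `git status --porcelain` before emitting `ITERATION_SUMMARY`.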
516
528
 
517
529
  **P342 classification taxonomy — retro-surfaced observations.** When the iter-retro's Step 4b Stage 1 surfaces a ticketable observation, the routing depends on classification:
518
530
 
@@ -838,7 +850,7 @@ Before spawning the next iteration's subagent, verify the working tree state aga
838
850
  |---|---|---|
839
851
  | Clean (empty output) | The subagent committed successfully (the default happy path) | Proceed to Step 7 |
840
852
  | Dirty for a known reason | A deliberate hand-off to the next iteration (e.g. the subagent chose to skip the commit and report "uncommitted state" because risk was above appetite — per the Non-Interactive Decision Making table above). Reason MUST be stated in the iteration report. | Include the dirty state in the next iteration's subagent context and proceed to Step 7 |
841
- | Dirty for an unknown reason | Neither of the above — the subagent reported success but the tree is not clean, or the tree is dirty without a documented reason in the iteration report | **Halt the loop.** Report the `git status --porcelain` output, the last subagent's reported outcome, and the divergence. Do NOT spawn the next iteration. |
853
+ | Dirty for an unknown reason | Neither of the above — the subagent reported success but the tree is not clean, or the tree is dirty without a documented reason in the iteration report. **P212 case (no longer a hand-off)**: dirty `docs/BRIEFING.md` / `docs/briefing/*.md` at iter exit is a bug class — Step 5 retro-on-exit clause #4 now requires the iter to commit retro's BRIEFING edits as `chore(briefing): refresh from iter retro (P<NNN>)` before emitting `ITERATION_SUMMARY`. A dirty BRIEFING-at-iter-exit means the iter's retro-on-exit clause did not run to completion (retro hook failure, scoring failure, commit-gate rejection) and the orchestrator must NOT silently absorb it via a main-turn hand-off commit. | **Halt the loop.** Report the `git status --porcelain` output, the last subagent's reported outcome, and the divergence. Do NOT spawn the next iteration. |
842
854
 
843
855
  **Rationale**: the orchestrator previously treated the subagent's reported outcome as truth. Any lie, partial write, or silent failure in the subagent propagated into the summary. The `git status --porcelain` check is the cheapest possible independent verification — policy-authorised, no network, no judgement required — and it catches exactly the class of failure the subagent cannot self-report.
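
One way to read the Step 6.75 table as code is the sketch below. `classify_tree_state` is a hypothetical helper, not part of the package. It takes the `git status --porcelain` output and a yes/no flag for whether the iteration report documented a reason, and it encodes the P212 reading that a dirty BRIEFING path is never an accepted hand-off.

```shell
#!/usr/bin/env bash
# classify_tree_state PORCELAIN_OUTPUT REASON_DOCUMENTED(yes|no)
# Prints one of: clean, dirty-known, dirty-unknown.
classify_tree_state() {
  local porcelain=$1
  local reason_documented=${2:-no}

  if [ -z "$porcelain" ]; then
    echo "clean"            # proceed to Step 7
  elif printf '%s\n' "$porcelain" | grep -qE '^.. docs/(BRIEFING\.md|briefing/)'; then
    echo "dirty-unknown"    # P212: dirty BRIEFING at iter exit is a bug class
  elif [ "$reason_documented" = "yes" ]; then
    echo "dirty-known"      # deliberate hand-off stated in the iteration report
  else
    echo "dirty-unknown"    # halt the loop
  fi
}
```

For example, `classify_tree_state "$(git status --porcelain)" no` prints `clean` on an untouched tree and `dirty-unknown` when anything is modified without a documented reason.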
844
856
 
@@ -892,7 +904,7 @@ When `AskUserQuestion` is unavailable or the user is AFK, the skill (and the del
892
904
  | Halt-path final summary with accumulated user-answerable skips (CI failure / Rule 5 above-appetite / dirty-unknown / session-continuity / fetch failure) | Run Step 2.5b's surfacing routine before emitting the halt path's final AFK summary. Step 2.5b is gated on ≥1 accumulated user-answerable skip — empty-skip halts skip the routine. Step 2.5b surfaces *prior-iter accumulated user-answerable skips only*; it does NOT ask the user how to remediate the halt cause itself (CI failure / above-appetite state / dirty-unknown state remain halt-with-bug-signal). Per ADR-013 Rule 1 + ADR-032 + P126 (`halt-paths-must-route-design-questions-through-Step-2.5b`). |
893
905
  | Unexpected dirty state between iterations | Halt the loop. Report the `git status --porcelain` output, the last iteration's reported outcome, and the divergence — per P036 (Step 6.75). Run Step 2.5b before emitting the halt summary if ≥1 accumulated user-answerable skip from prior iters (P126). Do NOT attempt non-interactive recovery of the dirty state itself. |
894
906
  | Iter committed cleanly + claim contradicts on-disk ADR Confirmation state (P335) | Halt the loop with `outcome: halted-iter-over-claim`. Include the `wr-itil-verify-iter-summary` stdout (the `OVER-CLAIM: ADR-NNN has N unchecked Confirmation item(s)...` lines) as the divergence detail. Run Step 2.5b before emitting the halt summary if ≥1 accumulated user-answerable skip from prior iters. Do NOT auto-correct the iter's claim — the orchestrator cannot retroactively make a false claim true; the user adjudicates on return (re-dispatch / accept partial / amend). Per ADR-013 Rule 6 + ADR-032 subprocess-boundary trust contract + P335 (Step 6.75 verify-iter-claims sub-step). |
895
- | External root cause detected at Open → Known Error, or at park with `upstream-blocked` reason | **Auto-invoke `/wr-itil:report-upstream`** via the manage-problem Step 6 external-root-cause detection AFK fallback (per ADR-024 2026-06-04 (P270) amendment). The report-upstream skill composes the draft then scores the prose via `wr-risk-scorer:external-comms` (ADR-028); below-appetite → sends (public-issue Step 5 / comment Step 5c / security Step 6 per classification); above-appetite → risk-reduces (open-ended LLM judgement per leaf (a)) then re-scores → sends-or-queues to `## Queued Upstream Report` (leaf (c)). Security routing per leaf (b): upstream-with-`SECURITY.md` + below-appetite → files via declared channel; upstream-without-`SECURITY.md` → external-comms-gated impact assessment. Queue does NOT halt (P352). Tickets already carrying the stable `- **Upstream report pending** — external dependency identified; invoke /wr-itil:report-upstream when ready` marker from prior sessions are detected via the already-noted grep check and routed to the report-upstream invocation; the marker shape is retained as the detection substrate. Per P063 (amended 2026-06-04) + P270 + ADR-013 Rule 6. |
907
+ | External root cause detected at Open → Known Error, or at park with `upstream-blocked` reason | **Auto-invoke `/wr-itil:report-upstream`** via the manage-problem Step 6 external-root-cause detection AFK fallback (per ADR-024 2026-06-04 (P270) amendment). The report-upstream skill composes the draft then scores the prose via `wr-risk-scorer:external-comms` (ADR-028); below-appetite → sends (public-issue Step 5 / comment Step 5c / security Step 6 per classification); above-appetite → risk-reduces (open-ended LLM judgement per leaf (a)) then re-scores → sends-or-queues to `## Queued Upstream Report` (leaf (c)). Security routing per leaf (b): upstream-with-`SECURITY.md` + below-appetite → files via declared channel; upstream-without-`SECURITY.md` → external-comms-gated impact assessment. Queue does NOT halt (P352). Tickets already carrying the stable `- **Upstream report pending** -- external dependency identified; invoke /wr-itil:report-upstream when ready` marker from prior sessions are detected via the already-noted grep check and routed to the report-upstream invocation; the marker shape is retained as the detection substrate (ASCII `--` per P210 — em-dash variant is the legacy form, still matched by the already-noted check for backward compatibility). Per P063 (amended 2026-06-04) + P270 + ADR-013 Rule 6. |
896
908
  | Mid-loop ask between iters in the orchestrator's main turn | Forbidden except at framework-prescribed user-interaction points (Step 0 session-continuity / fetch-failure halt; Step 2.5 / 2.5b loop-end emit; Step 6.5 above-appetite Rule 5 halt; Step 6.5 CI-failure / release:watch halt; Step 6.5 cohort-graduation halt-no-resolution halt; Step 6.5 cohort-graduation per-entry Rule 4 evidence-floor judgement (P308 — interactive only; AFK queues per P352); Step 6.75 dirty-for-unknown-reason halt). The loop's purpose is **progress + accumulation**; mechanical-stage transitions between iters are framework-resolved and MUST NOT prompt the user. Per ADR-044 framework-resolution boundary + ADR-013 Rule 1 (as amended by ADR-044) + P130. |
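
The already-noted check that must match both the ASCII `--` marker and the legacy em-dash marker can be sketched as below. `has_upstream_marker` is a hypothetical helper name; the patterns are taken from the marker strings quoted in this document and are not a copy of the skill's own grep.

```shell
#!/usr/bin/env bash
# has_upstream_marker TICKET_FILE
# Returns 0 when the ticket already carries any upstream marker, so the
# detection prompt (or the auto-invoke) can be skipped.
has_upstream_marker() {
  local ticket_file=$1
  # -e is used for every pattern because two of them begin with "-".
  grep -qE \
    -e '- \*\*Upstream report pending\*\* (--|—)' \
    -e '- \*\*Reported Upstream:\*\*' \
    -e '^## Reported Upstream' \
    "$ticket_file"
}
```

The alternation `(--|—)` keeps tickets written before the P210 change detectable without rewriting them.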
897
909
 
898
910
  ### Mid-loop ask discipline (orchestrator main turn) — P130
@@ -1023,6 +1035,7 @@ When every skipped ticket is in the `upstream-blocked` category (stop-condition
1023
1035
  - **ADR-022** (`docs/decisions/022-problem-verification-pending.proposed.md`) — iteration outcomes map into the return-summary's `outcome` field (`verifying` for a released fix, `known-error` for a root-cause-confirmed ticket awaiting release, etc.).
1024
1036
  - **ADR-032** (`docs/decisions/032-governance-skill-invocation-patterns.proposed.md`) — pattern taxonomy parent; Step 5 implements the AFK iteration-isolation wrapper — subprocess-boundary variant per the P084 amendment (2026-04-21), refining the P077 Agent-tool amendment. The P077 amendment remains in the ADR as the historical Agent-tool variant; the subprocess variant is the lead for new adopters.
1025
1037
  - **ADR-037** (`docs/decisions/037-skill-testing-strategy.proposed.md`) — doc-lint bats contract-assertion pattern used by `test/work-problems-step-5-delegation.bats`.
1038
+ - **P211** (`docs/problems/known-error/211-work-problems-orchestrator-carries-prior-ticket-fix-strategy-text-into-iter-dispatch-without-re-grounding.md`) — driver for Step 5 iteration-prompt-body's "Re-ground per iter" orchestrator-side construction invariant. The bug shape (reported as inbound from downstream consumer bbstats as their P194): the orchestrator builds each iter's dispatch prompt by reading the target ticket's `## Fix Strategy` section and citing it verbatim into the subprocess prompt; across iterations, prior-ticket Fix Strategy text leaks into subsequent dispatches without re-grounding in the new ticket's design intent, and iters land fixes anchored on the wrong design rationale. Fix: SKILL.md Step 5's "Iteration prompt body" section now carries an explicit re-grounding paragraph (immediately after the "self-contained" opener) that (a) names the per-iter re-ground invariant against current-ticket-ID + title only, (b) forbids inlining `## Fix Strategy` verbatim into the dispatch prompt (the subprocess reads it from disk via `/wr-itil:manage-problem`), (c) names the cross-iter leakage class (prior ticket ID, prior Fix Strategy text, prior outcome reason, prior commit SHA, prior retro findings, prior outstanding-questions), (d) names the construction shape (template-driven, reset per iter, no global accumulator). Behavioural second-source: `test/work-problems-step-5-prompt-body-re-grounding.bats` (structural-permitted per ADR-052 Surface 2; tdd-review comment in fixture cites P012 as harness-gap). Composes with P084 (subprocess-boundary isolation — re-grounding is the symmetric orchestrator-side property of the subprocess's "no prior conversation context"), ADR-032 (AFK iteration-isolation wrapper — re-grounding clarifies the wrapper's isolation intent on the orchestrator side), JTBD-006 (load-bearing — audit trail degrades if iters work the wrong ticket's design rationale).
1026
1039
  - **P206** (`docs/problems/known-error/206-work-problems-iter-workers-dont-add-changesets-fix-commits-accumulate-without-release.md`) — driver for Step 5 iter-prompt-body's explicit "if the fix changes shippable code, author a `.changeset/*.md` in the same commit" constraint (composes defence-in-depth with hook P141's `git commit`-time enforcement). Inbound-reported by downstream consumer **bbstats** as their P195 (`**Origin**: inbound-reported (bbstats#195)` per ADR-076 sort tier). Behavioural second-source: `test/work-problems-step-5-iter-changeset-required.bats` (structural-permitted per ADR-052; tdd-review comment in fixture).
1027
1040
  - **P141** (`docs/problems/verifying/141-iter-prompt-time-reminder-misses-40-percent-of-publishable-iters-hook-level-enforcement.md`) — sibling hook (`packages/itil/hooks/itil-changeset-discipline.sh`) that enforces the changeset-discipline rule at `git commit` time. The Step 5 iter-prompt-body constraint composes-with this hook; the prompt-time rule is load-bearing because plugin-hook execution depends on the marketplace cache carrying the current hook version (a fresh-cache adopter without P141 still gets the constraint via the prompt).
1028
1041
  - **JTBD-001**, **JTBD-006**, **JTBD-007**, **JTBD-101**, **JTBD-201** — personas whose reliability expectations the iteration-isolation wrapper restores. JTBD-006 (Progress the Backlog While I'm Away) + JTBD-007 (Keep Plugins Current Across Projects) are the load-bearing pair for the P206 changeset-discipline constraint — JTBD-006 requires the audit trail to stay accurate at release boundary; JTBD-007's closure depends on fixes actually shipping to npm.
@@ -0,0 +1,128 @@
1
+ #!/usr/bin/env bats
2
+ # P211 — work-problems Step 5 iteration-prompt-body must EXPLICITLY re-ground
3
+ # each iter's dispatch prompt against the CURRENT ticket only. The orchestrator
4
+ # MUST NOT inline the ticket's `## Fix Strategy` text verbatim, and MUST NOT
5
+ # leak prior-iter content (prior ticket ID, prior Fix Strategy text, prior
6
+ # outcome reason, prior commit SHA, prior retro findings) across iterations.
7
+ #
8
+ # Reported as inbound from downstream consumer bbstats (their P194) on
9
+ # 2026-05-15; covered by ADR-076 Origin field tier.
10
+ #
11
+ # Behavioural mechanism for the bug: AFK iter subprocesses inherit a stale
12
+ # design-rationale frame and may attempt fixes anchored on the wrong ticket's
13
+ # intent. Workaround the ticket names: user-in-the-loop verification after
14
+ # each iter, reading the subprocess's commit and checking whether it cites
15
+ # the correct ticket's design rationale — a manual-policing burden the AFK
16
+ # loop is meant to eliminate. JTBD-006 (Progress the Backlog While I'm Away)
17
+ # is load-bearing: the audit trail and trust in the AFK loop degrade if iters
18
+ # work the wrong ticket's design rationale.
19
+ #
20
+ # tdd-review: structural-permitted (justification: SKILL.md is the named
21
+ # contract document under ADR-052; behavioural alternative would require a
22
+ # synthetic `claude -p` iter dispatch harness that simulates multiple
23
+ # sequential iters and asserts no cross-iter prompt-body content leakage —
24
+ # that harness sits outside the skill layer and depends on the Anthropic CLI
25
+ # binary. Same Permitted Exception precedent as
26
+ # `work-problems-step-5-iter-changeset-required.bats:14-21`,
27
+ # `work-problems-step-5-delegation.bats:99-105`, and the P083 / P086 / P089
28
+ # ScheduleWakeup / retro / stdin-redirect fixtures in the same directory.
29
+ # P012 is the harness-gap ticket).
30
+ #
31
+ # @problem P211
32
+ # @problem P012
33
+ # @jtbd JTBD-006
34
+ # @jtbd JTBD-001
35
+ #
36
+ # Cross-reference:
37
+ # P211 — this ticket (orchestrator carries prior-ticket Fix Strategy into
38
+ # next iter's dispatch context — pollutes the new iter's framing)
39
+ # bbstats#194 — inbound report from downstream consumer
40
+ # ADR-014 (single-commit grain — fix lands as one coherent commit)
41
+ # ADR-032 (governance skill invocation patterns — AFK iteration-isolation
42
+ # wrapper; re-grounding is a clarification of that isolation intent)
43
+ # ADR-052 (behavioural tests default; structural-permitted with comment)
44
+ # ADR-076 (inbound-reported problems rank ahead via sort tier — Origin
45
+ # field stamping)
46
+ # JTBD-006 (Progress the Backlog While I'm Away) — load-bearing
47
+
48
+ setup() {
49
+ SKILL_DIR="$(cd "$(dirname "$BATS_TEST_FILENAME")/.." && pwd)"
50
+ SKILL_FILE="${SKILL_DIR}/SKILL.md"
51
+ }
52
+
53
+ @test "SKILL.md cites P211 (re-grounding driver) in Related section" {
54
+ # Self-documenting contract — a future contributor weakening the
55
+ # re-grounding constraint reads P211 and understands why it exists.
56
+ run grep -nE 'P211' "$SKILL_FILE"
57
+ [ "$status" -eq 0 ]
58
+ }
59
+
60
+ @test "SKILL.md Step 5 iteration prompt body names re-grounding per iter explicitly" {
61
+ # The "self-contained" opener at line 510 is the existing weaker form; the
62
+ # stricter "re-ground per iter" phrasing names the construction invariant
63
+ # the orchestrator MUST satisfy on each iter dispatch. P211's bug shape is
64
+ # exactly the case where "self-contained" was read as a subprocess-side
65
+ # property only, with the orchestrator-side construction leaking prior-iter
66
+ # content into the new iter's prompt body.
67
+ run grep -niE "re.?ground.{0,40}per iter|re.?grounded.{0,40}per iter|per.?iter.{0,40}re.?ground" "$SKILL_FILE"
68
+ [ "$status" -eq 0 ]
69
+ }
70
+
71
+ @test "SKILL.md Step 5 iter prompt body forbids inlining Fix Strategy verbatim" {
72
+ # The bug shape: orchestrator reads target ticket's `## Fix Strategy` and
73
+ # cites it verbatim into the iteration subprocess's prompt body. The
74
+ # SKILL.md MUST explicitly forbid this so future contributors understand
75
+ # the subprocess reads Fix Strategy from disk via manage-problem inside
76
+ # its own context.
77
+ run grep -niE "(not|never|MUST NOT|does not).{0,40}inline.{0,40}Fix Strategy|Fix Strategy.{0,40}(not|never|MUST NOT|does not).{0,40}inline|do not.{0,40}cite.{0,40}Fix Strategy.{0,40}verbatim|Fix Strategy.{0,40}verbatim.{0,40}(not|never|forbid)" "$SKILL_FILE"
78
+ [ "$status" -eq 0 ]
79
+ }
80
+
81
+ @test "SKILL.md Step 5 iter prompt body explicitly forbids prior-iter content leakage" {
82
+ # The cross-iter leakage class names: prior ticket ID, prior Fix Strategy
83
+ # text, prior outcome reason, prior commit SHA, prior retro findings. The
84
+ # SKILL.md MUST name the no-leakage invariant explicitly so the orchestrator
85
+ # main turn's prompt construction is constrained on every iter.
86
+ run grep -niE "(no prior|not.{0,20}prior|prior.?iter.{0,40}(leak|carry|inherit)|leak.{0,40}prior|carry.{0,40}prior.{0,40}iter)" "$SKILL_FILE"
87
+ [ "$status" -eq 0 ]
88
+ }
89
+
90
+ @test "SKILL.md Step 5 names template-driven reset-per-iter construction" {
91
+ # The construction shape: template-driven, reset per iter, no global
92
+ # accumulator across iters. This is the structural invariant the
93
+ # orchestrator main turn must satisfy when building each iter's prompt.
94
+ run grep -niE "template.?driven|reset per iter|reset.{0,20}per.{0,20}iter|no.{0,20}(global )?accumulator" "$SKILL_FILE"
95
+ [ "$status" -eq 0 ]
96
+ }
97
+
98
+ @test "SKILL.md Step 5 iteration prompt body cites P211 inline" {
99
+ # The re-grounding clause must cite P211 inline so the contract document
100
+ # is self-documenting — a future contributor removing the clause reads the
101
+ # P211 reference and understands why it exists before deleting it. Same
102
+ # pattern as the P083 / P086 / P146 / P232 inline citations in the same
103
+ # block.
104
+ run grep -nE "re.?ground.{0,200}P211|P211.{0,200}re.?ground|P211.{0,200}Fix Strategy|Fix Strategy.{0,200}P211" "$SKILL_FILE"
105
+ [ "$status" -eq 0 ]
106
+ }
107
+
108
+ @test "SKILL.md re-grounding clause sits inside Step 5 iteration prompt body section" {
109
+ # Structural locality: the re-grounding clause must live INSIDE Step 5's
110
+ # Iteration prompt body section (after the "self-contained" opener at
111
+ # line 510), not free-floating elsewhere in SKILL.md. Locality matters
112
+ # because the rule is read alongside the rest of the prompt-body contract,
113
+ # and a future contributor refactoring Step 5 must encounter it inline.
114
+ # Assertion shape: the line containing "re-ground" sits after the line
115
+ # containing "Iteration prompt body" and before the line containing
116
+ # "Return-summary contract".
117
+ iter_line=$(grep -nE '^\*\*Iteration prompt body' "$SKILL_FILE" | head -1 | cut -d: -f1)
118
+ # Tightened regex: require the literal hyphenated form "re-ground" /
119
+ # "re-grounded" / "re-grounding" so partial-substring matches like
120
+ # "foreground" (line 33) don't satisfy the assertion.
121
+ reground_line=$(grep -niE "re-ground(ed|ing)?" "$SKILL_FILE" | head -1 | cut -d: -f1)
122
+ return_summary_line=$(grep -nE '^\*\*Return-summary contract' "$SKILL_FILE" | head -1 | cut -d: -f1)
123
+ [ -n "$iter_line" ]
124
+ [ -n "$reground_line" ]
125
+ [ -n "$return_summary_line" ]
126
+ [ "$reground_line" -gt "$iter_line" ]
127
+ [ "$reground_line" -lt "$return_summary_line" ]
128
+ }