npm - @windyroad/itil - Versions diffs - 0.21.0 → 0.21.1-preview.217 - Mend

@windyroad/itil 0.21.0 → 0.21.1-preview.217

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (4)

package/.claude-plugin/plugin.json +1 -1
package/package.json +1 -1
package/skills/work-problems/SKILL.md +32 -11
package/skills/work-problems/test/work-problems-step-2-5b-cross-halt-routing.bats +149 -0

package/.claude-plugin/plugin.json CHANGED

@@ -1,5 +1,5 @@
 {
   "name": "wr-itil",
-  "version": "0.21.0",
+  "version": "0.21.1",
   "description": "ITIL-aligned IT service management for Claude Code"
 }

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@windyroad/itil",
-  "version": "0.21.0",
+  "version": "0.21.1-preview.217",
   "description": "ITIL-aligned IT service management for Claude Code (problem, and future incident/change skills)",
   "bin": {
     "windyroad-itil": "./bin/install.mjs"

package/skills/work-problems/SKILL.md CHANGED

@@ -75,6 +75,10 @@ After the fetch/divergence check, Step 0 MUST run a session-continuity detection
 - **Interactive** (`AskUserQuestion` is available AND the loop was not started in AFK mode): prompt the user with the Prior-Session State report and four options — **Resume the prior work** (land the drafted files as iter 1), **Discard the draft** and restart from scratch, **Leave-and-lower-priority** (skip the dirty paths and work the next backlog item that doesn't touch them), **Halt the loop** (too much dirty state to proceed non-interactively). Route the chosen branch before opening Step 1.
 - **Non-interactive / AFK** (default for this skill per JTBD-006): do NOT call `AskUserQuestion`. Halt the loop with the structured Prior-Session State report in the AFK summary. Per ADR-013 Rule 6 fail-safe: ambiguous session-continuity state requires user input; non-interactive recovery would mask the bug this check is meant to surface. This matches Step 6.75's "dirty for unknown reason → halt" stance at the Step 0 layer — the orchestrator does not silently proceed past partial work.
+**Step 2.5b cross-reference (P126)**: before emitting the final AFK summary for a Step 0 session-continuity halt, run Step 2.5b's surfacing routine. The routine is gated on ≥1 accumulated user-answerable skip; at Step 0 no iters have run yet so the gate is normally empty and Step 2.5b returns immediately, but the cross-reference is named here for contract uniformity — every halt path that emits a final summary routes through Step 2.5b regardless of whether the gating clause is empty in the typical case (`halt-paths-must-route-design-questions-through-Step-2.5b`).
+**Network failure halt (Step 0 fetch failure)**: if `git fetch origin` returns a network error, the loop halts and reports per the rule above. Before emitting the final AFK summary for a network-failure halt, run Step 2.5b's surfacing routine — same Step 2.5b cross-reference as the session-continuity halt. The gating clause is normally empty at Step 0 (no iters have run), but the cross-reference is named here for contract uniformity (`halt-paths-must-route-design-questions-through-Step-2.5b`).
 Step 6.75 treats a Step-0-resolved-with-user-confirmation state as `dirty-for-known-reason`: if the interactive branch's Resume option landed the drafted ADR as iter 1, the iter's commit clears the dirty state and the rest of the loop proceeds normally.
 #### README reconciliation preflight (per P118)
@@ -118,16 +122,9 @@ For stop-conditions #1 and #3 (no questions to ask), skip Step 2.5 and emit the
 The skipped tickets that triggered stop-condition #2 frequently carry **user-answerable design questions** (naming, direction, pacing, scope) whose answers would unblock the next AFK loop. The information the user needs to answer is fully known at stop time, so there is no cost to surfacing the questions before the terminal `ALL_DONE` emit.
-**1. Extract the question set.** For every skipped ticket whose classifier skip-reason is `user-answerable` (see Step 4's taxonomy), extract its outstanding question(s) from the ticket body — typically from a "Pacing decision", "Naming decision", or outstanding "Investigation Tasks" section. Cap at 4 questions per `AskUserQuestion` call per Anthropic's tool documentation.
-**2. Branch on interactivity per ADR-013 Rule 1 / Rule 6.**
+**1. Run the surfacing routine.** Step 2.5 calls Step 2.5b (the reusable surfacing routine defined below) with the accumulated user-answerable skip-reason set as input. Step 2.5b extracts the questions, branches on interactivity per ADR-013 Rule 1 / Rule 6, and either calls `AskUserQuestion` or emits the Outstanding Design Questions table. Stop-condition #2 is the canonical caller of Step 2.5b — by definition the stop fired *because* one or more remaining problems require interactive input, so the gating clause "≥1 user-answerable skip" is always satisfied here.
-- **Default branch — call `AskUserQuestion` when available** (the orchestrator's main turn is interactive by construction; the user is presumed at the keyboard). Batch the questions into one `AskUserQuestion` call (or more, if >4 questions, issued sequentially). Header: `"Outstanding design questions"`. For each question, set the prompt from the extracted text and the options from the ticket's candidate fixes or option list. Write each answer back to the corresponding ticket file so the next AFK loop does not re-ask. This is ADR-013 Rule 1 applied to the orchestrator's main-turn surface.
-- **Fallback branch — emit `### Outstanding Design Questions` table** when `AskUserQuestion` is unavailable (restricted permission mode, hook-disabled tool surface, or any other context where the structured-question primitive cannot fire). The table lists each question with its Ticket ID, the question text, and one-line context. The user answers on return. This is ADR-013 Rule 6 fail-safe — fall back to a structured summary when the structured-interaction primitive is unavailable.
-**Cross-skill principle (architect FLAG, P122)**: orchestrator main turns default to `AskUserQuestion` when available; the AFK persona (JTBD-006) is served by the **subprocess-boundary contract under ADR-032** (iteration subprocess workers are AFK by construction via `claude -p` — they exit at `ITERATION_SUMMARY` and never reach stop-condition #2), NOT by suppressing `AskUserQuestion` at the orchestrator layer. Step 5's iteration-prompt template carries the per-subprocess AFK contract (constraint: "Do not call `AskUserQuestion`"); stop-condition #2 fires only in the orchestrator's main turn where the user is presumed present. This principle generalises to any future AFK orchestrator that hits the same surface — defer the AFK persona to the subprocess boundary, not to the orchestrator's question-surfacing branch.
-**3. Emit the final summary + `ALL_DONE`.** The summary includes the Outstanding Design Questions table when any user-answerable questions were surfaced (see Output Format).
+**2. Emit the final summary + `ALL_DONE`.** The summary includes the Outstanding Design Questions table when any user-answerable questions were surfaced via Step 2.5b's fallback branch (see Output Format). When Step 2.5b's default branch fired (`AskUserQuestion` was available), the answers have already been written back to the corresponding ticket files and the table is omitted from the summary.
 ```
 ALL_DONE
@@ -135,6 +132,23 @@ ALL_DONE
 This sentinel line allows external scripts to detect completion.
+### Step 2.5b: Surface accumulated user-answerable skips (reusable surfacing routine, P122 + P126)
+Step 2.5b is the single source of truth for routing accumulated user-answerable skip-reasons through `AskUserQuestion`-when-available-else-table. It is the sub-step that Step 2.5 (stop-condition #2) AND every halt path that fires after iters have accumulated skipped tickets cross-references — keeping the surfacing logic in one place rather than duplicated across each halt path.
+**Gating clause — fire only when at least one accumulated user-answerable skip exists.** Iterate the skip list collected by Step 4's classifier and count entries whose `skip_reason_category == user-answerable`. If the count is zero, return immediately and let the caller emit its summary directly. This guards empty-skip halts (e.g. Step 0 fetch-failure halt before any iters have run) from triggering an unnecessary round-trip.
+**1. Extract the question set.** For every skipped ticket whose classifier skip-reason is `user-answerable` (see Step 4's taxonomy), extract its outstanding question(s) from the ticket body — typically from a "Pacing decision", "Naming decision", or outstanding "Investigation Tasks" section. Cap at 4 questions per `AskUserQuestion` call per Anthropic's tool documentation; the same cap applies regardless of whether Step 2.5b was invoked from stop-condition #2 or a halt path.
+**2. Branch on interactivity per ADR-013 Rule 1 / Rule 6.**
+- **Default branch — call `AskUserQuestion` when available** (the orchestrator's main turn is interactive by construction; the user is presumed at the keyboard). Batch the questions into one `AskUserQuestion` call (or more, if >4 questions, issued sequentially). Header: `"Outstanding design questions"`. For each question, set the prompt from the extracted text and the options from the ticket's candidate fixes or option list. Write each answer back to the corresponding ticket file so the next AFK loop does not re-ask. This is ADR-013 Rule 1 applied to the orchestrator's main-turn surface.
+- **Fallback branch — emit `### Outstanding Design Questions` table** when `AskUserQuestion` is unavailable (restricted permission mode, hook-disabled tool surface, or any other context where the structured-question primitive cannot fire). The table lists each question with its Ticket ID, the question text, and one-line context. The user answers on return. This is ADR-013 Rule 6 fail-safe — fall back to a structured summary when the structured-interaction primitive is unavailable.
+**Return.** Hand control back to the caller. The caller is responsible for emitting its own final summary (and the `ALL_DONE` sentinel for stop-condition #2; halt paths each have their own outcome label per Step 6 / ADR-042 Rule 5 / Step 6.75 / etc.).
+**Cross-skill principle (architect FLAG, P122 + P126)**: orchestrator main turns default to `AskUserQuestion` when available; the AFK persona (JTBD-006) is served by the **subprocess-boundary contract under ADR-032** (iteration subprocess workers are AFK by construction via `claude -p` — they exit at `ITERATION_SUMMARY` and never reach the orchestrator's stop or halt surfaces), NOT by suppressing `AskUserQuestion` at the orchestrator layer. Step 5's iteration-prompt template carries the per-subprocess AFK contract (constraint: "Do not call `AskUserQuestion`"); the orchestrator's stop and halt surfaces fire only in the main turn where the user is presumed present. P122 established this principle at Step 2.5; P126 extends it to every halt path that emits a final AFK summary (the principle: **halt-paths-must-route-design-questions-through-Step-2.5b** — every halt path that fires after iters have accumulated user-answerable skips MUST run Step 2.5b before emitting its summary).
 ### Step 3: Pick the highest-WSJF problem
 Select the problem with the highest WSJF score. If there's a tie, prefer:
@@ -361,7 +375,7 @@ After the iteration's commit lands but before starting the next iteration, check
 2. If `.changeset/` is non-empty after push, run `npm run release:watch` (merge the release PR + wait for npm publish).
 3. Resume the loop only after the release lands on npm.
-**Failure handling**: If `release:watch` fails (CI failure, publish failure), stop the loop and report the failure in the AFK summary. Do not retry non-interactively — the user must intervene.
+**Failure handling**: If `release:watch` fails (CI failure, publish failure), stop the loop and report the failure in the AFK summary. Do not retry non-interactively — the user must intervene. **Step 2.5b cross-reference (P126)**: before emitting the final AFK summary for a Failure handling / CI failure / release:watch halt, run Step 2.5b's surfacing routine. The routine is gated on ≥1 accumulated user-answerable skip; this halt path empirically frequently has accumulated skips from prior iters (the original P126 surface), so the gate is normally satisfied and Step 2.5b's AskUserQuestion-default branch fires (`halt-paths-must-route-design-questions-through-Step-2.5b`). The CI-failure cause itself remains a halt with bug-signal — Step 2.5b surfaces *prior-iter accumulated user-answerable skips only*; it does NOT ask the user how to remediate the CI failure (that requires the user to inspect the failing CI run on return).
 `push:watch` and `release:watch` are policy-authorised actions when residual risk is within appetite per RISK-POLICY.md, so no `AskUserQuestion` is required for the drain itself (ADR-013 Rule 5).
@@ -398,6 +412,8 @@ After the iteration's commit lands but before starting the next iteration, check
 - Any Verification Pending ticket IDs implicated per Rule 2b
 - A one-line scorer-gap note (e.g., "scorer produced only `move-to-holding` remediations; residual still ≥ 5/25 after exhaustion — extend scorer vocabulary per P108")
+**Step 2.5b cross-reference (P126)**: before emitting the Rule 5 halt iteration summary, run Step 2.5b's surfacing routine. The routine is gated on ≥1 accumulated user-answerable skip; Rule 5 halts that fire late in a long AFK loop frequently have accumulated skips from prior iters, so Step 2.5b's AskUserQuestion-default branch typically fires (`halt-paths-must-route-design-questions-through-Step-2.5b`). **Critical guard (architect FLAG)**: Step 2.5b surfaces *prior-iter accumulated user-answerable skips only* — it does NOT ask the user how to remediate the above-appetite state itself; the halt-causing scorer-gap remains a halt-with-bug-signal per ADR-042 Rule 5 invariant ("never release above appetite", scorer is the decision surface, not the user). Surfacing prior-iter skips does not retry the above-appetite remediation, does not bypass the never-release-above-appetite invariant, and does not convert the halt into a non-halt — it just takes the existing prior-iter user-input round-trip with it.
 Halt is a **bug signal** — the scorer should always have progressively more aggressive remediations available once P108 lands. Until then, exhaustion is expected when the only path to within-appetite requires a non-`move-to-holding` class.
 **Audit trail (ADR-042 Rule 6):** append one line per auto-apply to the iteration summary's Auto-apply trail subsection, including remediation ID, action class, pre/post scores, action taken, and description citation. For `move-to-holding` actions, also append to `docs/changesets-holding/README.md` "Currently held".
@@ -419,6 +435,8 @@ Before spawning the next iteration's subagent, verify the working tree state aga
 **Rationale**: the orchestrator previously treated the subagent's reported outcome as truth. Any lie, partial write, or silent failure in the subagent propagated into the summary. The `git status --porcelain` check is the cheapest possible independent verification — policy-authorised, no network, no judgement required — and it catches exactly the class of failure the subagent cannot self-report.
+**Step 2.5b cross-reference (P126)**: before emitting the final AFK summary for a Step 6.75 dirty-for-unknown-reason halt, run Step 2.5b's surfacing routine. The routine is gated on ≥1 accumulated user-answerable skip; Step 6.75 halts fire between iters and frequently have accumulated skips from prior iters, so Step 2.5b's AskUserQuestion-default branch typically fires (`halt-paths-must-route-design-questions-through-Step-2.5b`). The dirty-for-unknown-reason halt itself remains a halt with bug-signal — Step 2.5b surfaces *prior-iter accumulated user-answerable skips only*; it does NOT ask the user how to recover the dirty state (that remains a Rule 6 user-input requirement on return).
 **Out of scope for this step**: attempting recovery from an unknown-reason dirty state. Per ADR-013 Rule 6, conflict resolution and ambiguous state require user input; non-interactive recovery would mask the bug this check is meant to surface.
 ### Step 7: Loop
@@ -444,7 +462,8 @@ When `AskUserQuestion` is unavailable or the user is AFK, the skill (and the del
 | Prior-session partial work detected at start (session-continuity dirty: untracked `docs/decisions/*.proposed.md` / `docs/problems/*.md`, `.afk-run-state/iter-*.json` with `is_error: true` or `api_error_status >= 400`, stale `.claude/worktrees/*`, uncommitted SKILL.md/source/ADR edits) | Halt the loop with a structured Prior-Session State report in the AFK summary. Do NOT attempt non-interactive resume. Interactive invocations prompt via `AskUserQuestion` with 4 options (resume / discard / leave-and-lower-priority / halt). Per P109 + ADR-013 Rule 6 (Step 0 session-continuity detection pass). |
 | Fix verification needed | Skip problem, add to "needs verification" list |
 | Stop-condition #2 with user-answerable skip-reasons | Default: call AskUserQuestion (batched, ≤4 per call, sequential when >4) — the orchestrator's main turn is interactive by construction per ADR-032 subprocess-boundary; user is presumed at the keyboard. Fallback: emit Outstanding Design Questions table when AskUserQuestion is unavailable (Rule 6 fail-safe). Per ADR-013 Rule 1 + P122 (Step 2.5). |
-| Unexpected dirty state between iterations | Halt the loop. Report the `git status --porcelain` output, the last iteration's reported outcome, and the divergence — per P036 (Step 6.75). Do NOT attempt non-interactive recovery. |
+| Halt-path final summary with accumulated user-answerable skips (CI failure / Rule 5 above-appetite / dirty-unknown / session-continuity / fetch failure) | Run Step 2.5b's surfacing routine before emitting the halt path's final AFK summary. Step 2.5b is gated on ≥1 accumulated user-answerable skip — empty-skip halts skip the routine. Step 2.5b surfaces *prior-iter accumulated user-answerable skips only*; it does NOT ask the user how to remediate the halt cause itself (CI failure / above-appetite state / dirty-unknown state remain halt-with-bug-signal). Per ADR-013 Rule 1 + ADR-032 + P126 (`halt-paths-must-route-design-questions-through-Step-2.5b`). |
+| Unexpected dirty state between iterations | Halt the loop. Report the `git status --porcelain` output, the last iteration's reported outcome, and the divergence — per P036 (Step 6.75). Run Step 2.5b before emitting the halt summary if ≥1 accumulated user-answerable skip from prior iters (P126). Do NOT attempt non-interactive recovery of the dirty state itself. |
 | External root cause detected at Open → Known Error, or at park with `upstream-blocked` reason | Append the stable `- **Upstream report pending** — external dependency identified; invoke /wr-itil:report-upstream when ready` marker to the ticket's `## Related` section; do NOT auto-invoke `/wr-itil:report-upstream` (Step 6 security-path branch is interactive — per ADR-024 Consequences). Use the already-noted grep check to avoid duplicate lines. Per P063 + ADR-013 Rule 6. |
 ## Edge Cases
@@ -526,6 +545,8 @@ When every skipped ticket is in the `upstream-blocked` category (stop-condition
 - **P109** — session-continuity detection pass added to Step 0 after the fetch/divergence check. Enumerates five signals (untracked `docs/decisions/*.proposed.md`, untracked `docs/problems/*.md`, `.afk-run-state/iter-*.json` error markers, stale `.claude/worktrees/*` dirs, uncommitted SKILL.md/source/ADR edits). Routes interactive via `AskUserQuestion` with 4 options, AFK via halt-with-report per ADR-013 Rule 6.
 - **P041** — release-cadence drain (Step 6.5); remains in the orchestrator's main turn.
 - **P053** — Outstanding Design Questions surfacing at stop-condition #2 (Step 2.5); fed by the iteration subagent's `outstanding_questions` field.
+- **P122** (`docs/problems/122-work-problems-stop-condition-2-defaults-to-afk-table-instead-of-asking-interactively.verifying.md`) — established the AskUserQuestion-default-when-available routing at Step 2.5. The routing prose (default branch, Rule 6 fallback, cross-skill principle, user-answerable scoping) was originally landed under Step 2.5; P126 moved it into the reusable Step 2.5b sub-step.
+- **P126** (`docs/problems/126-work-problems-failure-handling-halt-bypasses-step-2-5-routing.known-error.md`) — extended the principle to every halt path that emits a final AFK summary. Step 2.5b is the single source of truth that Step 2.5, Step 0 (session-continuity + fetch-failure), Step 6.5 (Failure handling + Rule 5 above-appetite), and Step 6.75 (dirty-for-unknown-reason) all cross-reference. The principle: `halt-paths-must-route-design-questions-through-Step-2.5b`. Behavioural second-source: `test/work-problems-step-2-5b-cross-halt-routing.bats`.
 - **ADR-013** (`docs/decisions/013-structured-user-interaction-for-governance-decisions.proposed.md`) — Rule 6 non-interactive fail-safe applies to every iteration-subagent decision surface.
 - **ADR-014** (`docs/decisions/014-governance-skills-commit-their-own-work.proposed.md`) — preserved under the iteration subagent; the subagent commits its own work.
 - **ADR-015** (`docs/decisions/015-on-demand-assessment-skills.proposed.md`) — Agent-tool-vs-Skill-tool delegation precedent (Step 6.5's wording mirror).
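
The Step 2.5b routine added in the SKILL.md hunks above (gate on at least one user-answerable skip, then ask in batches of at most 4 or fall back to a table) can be illustrated with a small shell sketch. This is a reader's illustration under stated assumptions, not code from the package: the tab-separated input format, the function names `user_answerable_skips` and `surface_skips`, and the `yes`/`no` availability flag are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of Step 2.5b's gate and branch; not the skill's implementation.
# Assumed input on stdin: one skip per line, "TICKET_ID<TAB>CATEGORY<TAB>QUESTION".

# Keep only the skips whose category is user-answerable.
user_answerable_skips() {
  awk -F'\t' '$2 == "user-answerable"'
}

# $1 = "yes" when the structured-question tool is assumed to be available.
surface_skips() {
  local ask_available="$1" skips count
  skips="$(user_answerable_skips)"

  # Gating clause: no user-answerable skips, so hand control back to the caller.
  if [ -z "$skips" ]; then
    echo "gate-empty"
    return 0
  fi

  count="$(printf '%s\n' "$skips" | wc -l | tr -d ' ')"
  if [ "$ask_available" = "yes" ]; then
    # Default branch: at most 4 questions per call, issued sequentially.
    echo "ask-batches $(( (count + 3) / 4 ))"
  else
    # Fallback branch: emit the summary table instead of asking.
    echo "### Outstanding Design Questions"
    echo "| Ticket | Question |"
    echo "|--------|----------|"
    printf '%s\n' "$skips" | awk -F'\t' '{ printf "| %s | %s |\n", $1, $3 }'
  fi
}
```

For example, piping five user-answerable skips into `surface_skips yes` would plan two batches, while a list containing only `upstream-blocked` skips returns at the gate without any round-trip, which is the case the Step 0 cross-references describe.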

package/skills/work-problems/test/work-problems-step-2-5b-cross-halt-routing.bats ADDED

@@ -0,0 +1,149 @@
+#!/usr/bin/env bats
+# P126: work-problems halt paths must route accumulated user-answerable
+# skips through Step 2.5's surfacing routine before emitting the AFK
+# summary. P122 fixed the routing at Step 2.5 stop-condition #2; P126
+# extends the same contract to the remaining halt paths (Step 0
+# session-continuity, Step 0 fetch failure, Step 6.5 CI-failure, Step
+# 6.5 ADR-042 Rule 5, Step 6.75 dirty-for-unknown-reason).
+#
+# The fix shape: extract Step 2.5's surfacing routine into a reusable
+# named sub-step (Step 2.5b) that every halt path cross-references
+# before emitting its summary. Empty-skip halts skip the round-trip
+# (gating clause: ≥1 user-answerable skip accumulated).
+#
+# Doc-lint contract assertions per ADR-037 Permitted Exception
+# (structural checks on prose contract, not behavioural coverage).
+#
+# @problem P126
+# @jtbd JTBD-001 (Enforce Governance Without Slowing Down)
+# @jtbd JTBD-006 (Progress the Backlog While I'm Away)
+setup() {
+  REPO_ROOT="$(cd "$(dirname "$BATS_TEST_FILENAME")/../../../../.." && pwd)"
+  SKILL_MD="$REPO_ROOT/packages/itil/skills/work-problems/SKILL.md"
+  BRIEFING_MD="$REPO_ROOT/docs/briefing/afk-subprocess.md"
+}
+@test "work-problems P126: SKILL.md exists" {
+  [ -f "$SKILL_MD" ]
+}
+@test "work-problems P126: SKILL.md names Step 2.5b as the reusable surfacing routine" {
+  # The fix extracts Step 2.5's AskUserQuestion-when-available-else-table
+  # logic into a named reusable sub-step that halt paths can cross-reference
+  # uniformly. The name MUST appear as a heading or anchor so cross-references
+  # resolve to a single source of truth.
+  run grep -F 'Step 2.5b' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+}
+@test "work-problems P126: Step 2.5b heading exists in SKILL.md" {
+  # Stronger structural check: Step 2.5b must be a markdown heading (### or
+  # ####) so it is a navigable anchor, not just an inline mention.
+  run grep -E '^#{3,4} Step 2\.5b' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+}
+@test "work-problems P126: Step 2.5b preserves the AskUserQuestion default branch" {
+  # Step 2.5b must inherit P122's interactive-default routing — calling
+  # AskUserQuestion when available is the load-bearing fix.
+  run grep -F 'AskUserQuestion' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+}
+@test "work-problems P126: Step 2.5b preserves the table fallback for ADR-013 Rule 6" {
+  # Rule 6 fail-safe: when AskUserQuestion is unavailable, emit the
+  # Outstanding Design Questions table.
+  run grep -F 'Outstanding Design Questions' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+}
+@test "work-problems P126: Step 2.5b is gated on at least one accumulated user-answerable skip" {
+  # The gating clause prevents empty-skip halts from triggering an
+  # unnecessary round-trip. Architect-flagged refinement: the gate clause
+  # must be named in Step 2.5b so future authors don't copy-paste cross-
+  # references and forget the gate.
+  run grep -nE 'at least one accumulated user-answerable skip|≥ ?1 (accumulated )?user-answerable skip|>= ?1 user-answerable skip|one or more user-answerable skip' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+}
+@test "work-problems P126: Step 0 session-continuity halt cross-references Step 2.5b" {
+  # P109's Step 0 AFK fallback halts with a Prior-Session State report.
+  # When iters have accumulated user-answerable skips before the halt fires
+  # (rare at Step 0 since iters haven't run yet, but the contract must be
+  # uniform), the halt must run Step 2.5b first.
+  run grep -nE 'Step 0.*Step 2\.5b|session-continuity.*Step 2\.5b' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+}
+@test "work-problems P126: Step 0 fetch failure halt cross-references Step 2.5b" {
+  # Step 0's git fetch network failure halt also emits a final summary
+  # without iter context — the cross-reference is for contract uniformity
+  # even when no skips can accumulate at Step 0.
+  run grep -nE 'fetch.*Step 2\.5b|Network failure.*Step 2\.5b|fetch failure.*Step 2\.5b' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+}
+@test "work-problems P126: Step 6.5 CI-failure halt cross-references Step 2.5b" {
+  # The Step 6.5 "Failure handling" clause halts on push:watch / release:watch
+  # failure. After N iters this halt path frequently has accumulated user-
+  # answerable skips (the empirically observed P126 surface).
+  run grep -nE 'Failure handling.*Step 2\.5b|CI failure.*Step 2\.5b|release:watch.*Step 2\.5b' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+}
+@test "work-problems P126: Step 6.5 ADR-042 Rule 5 halt cross-references Step 2.5b with halt-vs-prior-skip guard" {
+  # Architect-flagged refinement: the cross-reference under Rule 5 halt
+  # must explicitly distinguish "Step 2.5b surfaces prior-iter accumulated
+  # skips" from "ADR-042 Rule 5 halt remains the bug signal — the user is
+  # NOT asked how to remediate the above-appetite state". Two grep checks:
+  # cross-reference present AND the guard prose present nearby.
+  run grep -nE 'Rule 5.*Step 2\.5b|halted-above-appetite.*Step 2\.5b' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+  # Guard: prior-iter accumulated skips are surfaced; the halt-causing
+  # scorer-gap remains a halt with bug-signal.
+  run grep -nE 'prior[- ]iter accumulated|surfaces? prior[- ]iter|NOT ask the user how to remediate|halt-causing scorer[- ]gap' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+}
+@test "work-problems P126: Step 6.75 dirty-for-unknown-reason halt cross-references Step 2.5b" {
+  # P036's Step 6.75 inter-iter verification halt emits a final summary
+  # when git status is dirty for unknown reason between iters. After N
+  # iters this halt path is the second most common P126 surface
+  # empirically.
+  run grep -nE 'Dirty for an unknown reason.*Step 2\.5b|dirty[- ]for[- ]unknown[- ]reason.*Step 2\.5b|Step 6\.75.*Step 2\.5b' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+}
+@test "work-problems P126: Step 2.5 still cross-references Step 2.5b (single source of truth)" {
+  # The original Step 2.5 stop-condition #2 branch must also call into
+  # Step 2.5b — keeping the surfacing logic in one place rather than
+  # duplicated between Step 2.5 and Step 2.5b.
+  run grep -nE 'Step 2\.5 .*Step 2\.5b|Step 2\.5b.*surfacing|Step 2\.5\b.*calls? Step 2\.5b' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+}
+@test "work-problems P126: Decisions Table row exists for halt-paths-cross-route via Step 2.5b" {
+  # The "Non-Interactive Decision Making" Decisions Table at the bottom
+  # of SKILL.md must carry a row that names the cross-halt routing so the
+  # decision summary is consistent with the Step prose.
+  run grep -nE '\| Halt[- ]path .*Step 2\.5b|halt path.*accumulated user-answerable' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+}
+@test "work-problems P126: Briefing entry documents the halt-paths-must-route principle" {
+  # The cross-session briefing entry must record the principle alongside
+  # the existing P122 entry so future sessions inherit the reasoning.
+  [ -f "$BRIEFING_MD" ]
+  run grep -nE 'P126|halt[- ]paths[- ]must[- ]route|halt-paths-must-route-design-questions-through-Step-2\.5' "$BRIEFING_MD"
+  [ "$status" -eq 0 ]
+}
+@test "work-problems P126: Briefing entry cross-references P122" {
+  # Architect-flagged refinement: the briefing entry must cite both
+  # P122 (parent) and P126 (extension) so the principle's evolution is
+  # traceable.
+  run grep -nE 'P122' "$BRIEFING_MD"
+  [ "$status" -eq 0 ]
+}
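
Several assertions in the test file above (for example the `AskUserQuestion` and `Outstanding Design Questions` checks) grep the whole of SKILL.md, so they would also pass on a match outside the Step 2.5b section. A section-scoped variant could be sketched as follows. The helper name `section_body` and its awk logic are hypothetical and not part of the package; the sketch assumes the section starts at a heading line beginning with the given prefix and ends at the next `### ` heading.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the body of one level-3 section of a Markdown file.
# $1 = heading prefix to match at the start of a line (e.g. '### Step 2.5b')
# $2 = path to the Markdown file
section_body() {
  local heading_prefix="$1" file="$2"
  awk -v prefix="$heading_prefix" '
    index($0, prefix) == 1 { in_section = 1; next }  # start after the heading line
    in_section && /^### /  { exit }                  # stop at the next level-3 heading
    in_section             { print }
  ' "$file"
}
```

Inside a Bats test this could be used as `section_body '### Step 2.5b' "$SKILL_MD" | grep -qF 'AskUserQuestion'`, so the check only passes when the phrase appears within the Step 2.5b section itself.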