@windyroad/itil 0.23.4 → 0.23.5-preview.263

Files changed (4)

package/.claude-plugin/plugin.json +1 -1
package/package.json +1 -1
package/skills/work-problems/SKILL.md +2 -1
package/skills/work-problems/test/work-problems-step-5-bats-polling-discipline.bats +91 -0

package/.claude-plugin/plugin.json CHANGED

@@ -1,5 +1,5 @@
 {
   "name": "wr-itil",
-  "version": "0.23.4",
+  "version": "0.23.5",
   "description": "ITIL-aligned IT service management for Claude Code"
 }

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@windyroad/itil",
-  "version": "0.23.4",
+  "version": "0.23.5-preview.263",
   "description": "ITIL-aligned IT service management for Claude Code (problem, and future incident/change skills)",
   "bin": {
     "windyroad-itil": "./bin/install.mjs"

package/skills/work-problems/SKILL.md CHANGED

@@ -298,7 +298,7 @@ rm -f "$ITER_JSON"
 1. **Context**: this is one iteration of the AFK work-problems loop. The user is AFK. The orchestrator selected `P<NNN> (<title>)` as the highest-WSJF actionable ticket.
 2. **Task**: apply the `/wr-itil:manage-problem` workflow for `work highest WSJF problem that can be progressed non-interactively as the user is AFK`. Follow manage-problem SKILL.md verbatim, including architect / jtbd / style-guide / voice-tone gate reviews and the commit gate (manage-problem Step 11). Because this subprocess has the Agent tool in its own surface, the normal review-via-subagent paths work — no inline-verdict fallback needed.
-3. **Constraints**: commit the completed work per ADR-014. Do NOT push, do NOT run `push:watch`, do NOT run `release:watch` — the orchestrator's Step 6.5 owns release cadence. Do NOT invoke `capture-*` background skills (AFK carve-out — ADR-032). Do NOT use `ScheduleWakeup` under any circumstance (P083 — iteration workers must not self-reschedule). **NEVER call `AskUserQuestion` mid-loop in AFK** (P135 / ADR-044): direction / deviation-approval / one-time-override / silent-framework observations queue at `ITERATION_SUMMARY.outstanding_questions` for loop-end batched presentation. Per-iter `AskUserQuestion` calls are sub-contracting framework-resolved decisions back to the user (lazy deferral per Step 2d Ask Hygiene Pass classification). Non-interactive defaults apply per ADR-013 Rule 6 + ADR-044's framework-resolution boundary. **Treat the user as transient** (P130): even when observably present at orchestrator dispatch time, the user may answer one question and disappear for hours; presence is not a reliable signal and is not the goal. The iter's job is to progress the ticket and accumulate questions for batched surfacing — not to ask "is it OK to proceed?" at a mechanical-stage boundary.
+3. **Constraints**: commit the completed work per ADR-014. Do NOT push, do NOT run `push:watch`, do NOT run `release:watch` — the orchestrator's Step 6.5 owns release cadence. Do NOT invoke `capture-*` background skills (AFK carve-out — ADR-032). Do NOT use `ScheduleWakeup` under any circumstance (P083 — iteration workers must not self-reschedule). **NEVER call `AskUserQuestion` mid-loop in AFK** (P135 / ADR-044): direction / deviation-approval / one-time-override / silent-framework observations queue at `ITERATION_SUMMARY.outstanding_questions` for loop-end batched presentation. Per-iter `AskUserQuestion` calls are sub-contracting framework-resolved decisions back to the user (lazy deferral per Step 2d Ask Hygiene Pass classification). Non-interactive defaults apply per ADR-013 Rule 6 + ADR-044's framework-resolution boundary. **Treat the user as transient** (P130): even when observably present at orchestrator dispatch time, the user may answer one question and disappear for hours; presence is not a reliable signal and is not the goal. The iter's job is to progress the ticket and accumulate questions for batched surfacing — not to ask "is it OK to proceed?" at a mechanical-stage boundary. **Do NOT poll `bats` output with a bats-console-summary regex against TAP-format output** (P146 — bash until-loop-deadlock antipattern). The bats-console-summary line `<N> tests, <M> failures` is emitted ONLY by bats's *default* (non-TAP) formatter; `bats --tap` does not emit a console summary, so a polling loop of shape `until [ -f $OUT ] && grep -qE '^[0-9]+ tests?,' $OUT; do sleep 5; done` spins forever after bats completes (silent deadlock — no error, no exit; recovery requires manual SIGTERM with metadata loss per the P146/P147 stuck-before-emit subclass). When you need to wait on a backgrounded bats run, prefer `wait $bg_pid` (Unix idiom — completion signaled by process exit, no regex required) or, for the Bash tool, `run_in_background=true` + `BashOutput` polling on the tool's exit-state field rather than regex-poll on stdout. If you genuinely must regex-poll TAP output, anchor on the TAP plan line `^[0-9]+\.\.[0-9]+` (e.g. `1..1455`) — TAP's plan line is emitted on completion and is format-stable across bats versions; the bats-console-summary line is not. The console-summary vs TAP-format divergence is the load-bearing detail: `bats` and `bats --tap` produce structurally different stdout, and the antipattern assumes the former when iter dispatch typically uses the latter.
 4. **Retro-on-exit (P086)**: before emitting `ITERATION_SUMMARY`, invoke `/wr-retrospective:run-retro`. Retro runs INSIDE this subprocess so its Step 2b pipeline-instability scan has access to the iteration's rich tool-call history (hook misbehaviour, repeat-workaround patterns, subagent-delegation friction, release-path instability). Retro may create tickets or update `docs/BRIEFING.md` — run-retro commits its own work per ADR-014; any tickets it creates ride into either the iteration's own commit (if retro runs before the main commit) or a retro-owned follow-up commit, and the orchestrator picks them up on the next Step 1 scan. Proceed to `ITERATION_SUMMARY` emission regardless of retro findings — retro is non-blocking (do not block on retro): if retro fails or surfaces findings, the iteration still returns a summary so the AFK loop does not silently halt on a flaky retro run.
 5. **Output**: end the final message with the `ITERATION_SUMMARY` block defined below — this is how the orchestrator consumes the iteration's result.
@@ -661,6 +661,7 @@ When every skipped ticket is in the `upstream-blocked` category (stop-condition
 ## Related
 - **P121** (`docs/problems/121-afk-orchestrator-should-sigterm-stuck-subprocesses-after-idle-timeout.verifying.md`) — driver for Step 5's backgrounded-poll-loop dispatch shape (replacing the prior foreground-synchronous form) and the idle-timeout SIGTERM branch. The 2026-04-25 P118 iter 5 evidence: an iteration subprocess sat idle ~70 min after its final commit, then SIGTERM produced a clean JSON exit-flush. Fix: orchestrator backgrounds the subprocess, polls every 60s, computes `LAST_ACTIVITY_MARK = max(DISPATCH_START_EPOCH, git log -1 --format=%at HEAD)`, and sends SIGTERM when `now - LAST_ACTIVITY_MARK > WORK_PROBLEMS_IDLE_TIMEOUT_S` (default 3600s = 60 min). Behavioural second-source: `test/work-problems-step-5-idle-timeout-sigterm.bats` exercises a fake `claude -p` shim that sleeps past the threshold and asserts SIGTERM, JSON exit-flush, env-var override, and within-threshold no-fire. Step 6's per-iter progress line SHOULD annotate `(SIGTERM_SENT)` when the branch fires so users can distinguish recovered iters from natural completions. ADR-032's subprocess-boundary variant amended 2026-04-26 with the backgrounded-poll-loop refinement.
+- **P146** (`docs/problems/146-afk-iteration-subprocess-bash-until-loop-polls-bats-output-with-bats-console-regex-against-tap-format.verifying.md`) — driver for Step 5 iteration prompt body's bats-output-polling-discipline clause. The 2026-04-29 incident (iter 1, PID 23580 child PID 16408) saw a `bash until`-loop poll a backgrounded bats output file with regex `^[0-9]+ tests?,` (bats's *default* console-summary format) against `bats --tap` output that never emits that line — silent infinite spin after bats completed; manual SIGTERM at 68m34s wall-clock; metadata loss per the P147 stuck-before-emit subclass. The polling idiom is NOT taught by any SKILL.md (audit confirmed via repo grep) — it is agent-learned from training data. Fix: prompt-discipline rule in the iteration prompt body's Constraints list explicitly forbidding the antipattern, naming `wait $bg_pid` (or Bash-tool `run_in_background=true` + `BashOutput`) as the safe substitute, and citing the TAP-vs-console-summary divergence so future contributors don't "fix" the rule incorrectly. Behavioural second-source: `test/work-problems-step-5-bats-polling-discipline.bats` asserts the prohibition phrase, the safe-substitute pointer, the P146 cite, the divergence explanation, and the Related-section cite.
 - **P147** (`docs/problems/147-p121-sigterm-clean-flush-guarantee-conditional-needs-skill-md-caveat-for-stuck-before-emit-subclass.verifying.md`) — refinement to P121's "clean exit-flush" claim. P118's evidence held only for subprocesses that had already emitted `ITERATION_SUMMARY` before going idle; the 2026-04-29 P146 incident produced exit 143 + 0-byte JSON when SIGTERM fired before `ITERATION_SUMMARY` emission. Fix: SKILL.md prose now carries the conditional caveat (Step 5 "SIGTERM exit-flush is conditional, not universal" subsection) and adopters reading the prose are directed to treat exit 143 + 0-byte JSON as a metadata-loss event — verify work integrity from `git log` + `git status --porcelain`, halt the AFK loop, and reconstruct cost from the Anthropic billing dashboard. Behavioural second-source extends `test/work-problems-step-5-idle-timeout-sigterm.bats` with a stuck-before-emit fake-shim asserting `JSON_BYTES=0` after SIGTERM. Mechanism unchanged (SIGTERM remains the right recovery primitive); the refinement is documentation accuracy + the metadata-loss-event handling shape.
 - **P089** (`docs/problems/089-work-problems-step-5-dispatch-robustness-stdin-warning-and-cost-metadata-edge-case.verifying.md`) — driver for Step 5's `< /dev/null` dispatch redirect and the Per-iteration cost metadata "Authority hierarchy" paragraph. Gap 1: stdin warning contaminated stderr-merged JSON captures; closed by adding `< /dev/null` to the canonical dispatch command. Gap 2: `.usage.*` undercounts when subprocess exits via a background-task completion ack while `.total_cost_usd` stays cumulative-authoritative; closed by documenting the authority hierarchy in Step 5 and the Session Cost output section so adopters trust cost and label token totals best-effort.
 - **P086** (`docs/problems/086-afk-iteration-subprocess-does-not-run-retro-before-returning.verifying.md`) — driver for Step 5's retro-on-exit clause. Iteration subprocesses exit without running retro, so per-iteration friction (hook misbehaviour, repeat-workaround patterns, pipeline instability) evaporates on exit. Fix: iteration prompt body names `/wr-retrospective:run-retro` as a closing step before `ITERATION_SUMMARY` emission; retro runs inside the subprocess so Step 2b pipeline-instability scan has the full tool-call history; run-retro commits its own work per ADR-014; orchestrator picks up retro-created tickets on the next Step 1 scan.
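
Reviewer sketch (not part of the package): the new Constraints clause prefers waiting on process exit over regex-polling stdout. The following is a minimal illustration of that idea. It assumes a hypothetical `fake_bats_tap` function as a stand-in for a real `bats --tap` run, so the snippet runs without bats installed; `out`, `bg_pid`, and the other variable names are likewise illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for `bats --tap`: writes TAP-style lines and exits 0.
# It is NOT the real bats CLI; it only mimics the shape of TAP output.
fake_bats_tap() {
  printf '1..2\n'
  sleep 1
  printf 'ok 1 first test\nok 2 second test\n'
}

out="$(mktemp)"

# Run the stand-in in the background and remember its PID.
fake_bats_tap > "$out" &
bg_pid=$!

# Completion is signalled by process exit, so no regex is needed.
wait "$bg_pid"
bats_status=$?

# The console-summary regex from the antipattern never matches TAP output,
# which is why an until-loop keyed on it would spin forever.
if grep -qE '^[0-9]+ tests?,' "$out"; then
  summary_seen=yes
else
  summary_seen=no
fi

# Count TAP result lines to show the run really did produce output.
ok_lines="$(grep -cE '^(ok|not ok) ' "$out")"

echo "exit=$bats_status summary_seen=$summary_seen ok_lines=$ok_lines"
rm -f "$out"
```

Because completion is signalled by process exit, the check does not depend on which formatter produced the output. The sketch also avoids polling for the plan line: the TAP specification allows the plan at either the start or the end of the stream, so its presence alone may not indicate that a run has finished.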

package/skills/work-problems/test/work-problems-step-5-bats-polling-discipline.bats ADDED Viewed

@@ -0,0 +1,91 @@
+#!/usr/bin/env bats
+# Doc-lint guard: work-problems SKILL.md Step 5 iteration prompt body must
+# carry a prompt-discipline rule forbidding the bash-until-loop bats-output-
+# polling antipattern that caused P146.
+#
+# Antipattern (observed 2026-04-29, iter 1, PID 23580 child PID 16408 —
+# 68m34s wall-clock burn + lost JSON metadata):
+#   until [ -f $TASK_OUTPUT ] && grep -qE '^[0-9]+ tests?,' $TASK_OUTPUT \
+#     2>/dev/null; do sleep 5; done; tail -25 $TASK_OUTPUT
+# The regex `^[0-9]+ tests?,` matches bats's *default* (non-TAP) console-
+# summary line. `bats --tap` never emits that line, so the until-loop spins
+# forever after bats completes. Silent deadlock; no error; no exit.
+#
+# The polling idiom is NOT taught by any SKILL.md (grep across the repo
+# returns only this ticket itself). It is agent-learned from training data —
+# a generic bash + bats Bourne idiom. The fix per the ticket strategy is a
+# prompt-discipline rule + behavioural assertion: the iteration prompt body
+# must explicitly forbid the antipattern AND name a safe substitute, so the
+# iteration agent does not fall back on the learned idiom.
+#
+# Structural assertion — Permitted Exception under ADR-037 (SKILL.md is
+# explicitly a contract document; doc-lint contract assertion against the
+# contract document itself is the named permitted pattern; same rationale
+# as work-problems-step-5-delegation.bats's @problem P083 / P089 fixtures).
+# Behavioural alternative would require spawning a real `claude -p`
+# subprocess and observing its tool-call traces; that harness sits outside
+# the skill layer.
+#
+# @problem P146
+# @jtbd JTBD-006
+# @jtbd JTBD-001
+# @jtbd JTBD-101
+#
+# Cross-reference:
+#   P146 (bash until-loop polls bats output with bats-console regex against
+#     TAP-format) — driver for this clause
+#   P147 (sibling — SIGTERM-flush metadata-loss caveat fires when an iter
+#     deadlocks before ITERATION_SUMMARY emission; P146's regex-poll is the
+#     concrete mechanism behind today's stuck-before-emit incident)
+#   ADR-037 (skill testing strategy — contract-assertion Permitted Exception)
+#   ADR-014 (single-commit grain — fix + bats + ticket transition land
+#     together)
+setup() {
+  SKILL_DIR="$(cd "$(dirname "$BATS_TEST_FILENAME")/.." && pwd)"
+  SKILL_FILE="${SKILL_DIR}/SKILL.md"
+}
+@test "SKILL.md Step 5 iteration prompt forbids bats-console-summary regex polling (P146)" {
+  # The iteration prompt body MUST explicitly name the antipattern so the
+  # iteration agent does not reach for the learned bash idiom. The
+  # prohibition phrase must reference the regex-poll-on-bats-output shape
+  # OR the bats-console-summary line shape, since both are the failure
+  # surface today.
+  run grep -niE "do not poll.{0,40}bats|bats.{0,40}console.summary.{0,40}regex|never poll.{0,40}bats|grep.{0,20}tests\\?,.{0,80}(forbidden|antipattern|do not|never)" "$SKILL_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "SKILL.md Step 5 iteration prompt names a safe substitute for backgrounded bats waits (P146)" {
+  # Forbidding the antipattern without naming a substitute leaves the agent
+  # without a path forward, and the agent will silently fall back on the
+  # antipattern. The prompt must name `wait \$bg_pid` (Unix idiom) OR the
+  # Bash tool's `run_in_background=true` + BashOutput check (Claude Code
+  # idiom) as the recommended replacement.
+  run grep -niE "wait[[:space:]]+[\"']?\\\$[A-Z_]*pid|wait[[:space:]]+[\"']?\\\$[A-Z_]*PID|run_in_background.{0,80}BashOutput|BashOutput.{0,80}run_in_background|background.{0,40}wait.{0,40}exit" "$SKILL_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "SKILL.md Step 5 bats-polling clause cites P146 (self-documenting contract)" {
+  # The clause must cite P146 inline so a future contributor reading the
+  # rule understands why it exists before deleting it (mirrors the P083
+  # ScheduleWakeup-clause discipline already in place at line ~301).
+  run grep -nE "P146" "$SKILL_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "SKILL.md Step 5 bats-polling clause names the TAP/console-summary divergence (P146)" {
+  # The clause must explain WHY the regex `^[0-9]+ tests?,` fails — the
+  # divergence between bats's default (console-summary) and `--tap` output
+  # formats. Without this, a future contributor who switches from `--tap`
+  # to default formatting may "fix" the rule incorrectly.
+  run grep -niE "TAP.{0,80}console.summary|console.summary.{0,80}TAP|--tap.{0,80}(does not|never|no console summary)|tap.format.{0,80}(does not|never).{0,40}emit" "$SKILL_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "SKILL.md Related section cites P146" {
+  # Self-documenting contract: ticket cited in Related so a future reader
+  # of the file finds the driving incident without scrolling to find it.
+  run grep -nE "P146" "$SKILL_FILE"
+  [ "$status" -eq 0 ]
+}
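
Reviewer sketch (not part of the package): the test "SKILL.md Related section cites P146" runs the same `grep -nE "P146"` over the whole file as the earlier P146-citation test, so it would still pass if only the Constraints clause cited the ticket. One way to scope the check to the `## Related` section might look like the following. `section_body` is a hypothetical helper, and the demo file is fabricated purely for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print only the lines that sit under a given
# level-2 markdown heading, stopping at the next level-2 heading.
section_body() {
  local heading="$1" file="$2"
  awk -v h="## $heading" '
    $0 == h { in_sec = 1; next }   # start printing after the heading
    /^## /  { in_sec = 0 }         # any other level-2 heading ends it
    in_sec  { print }
  ' "$file"
}

# Fabricated demo document standing in for SKILL.md.
demo="$(mktemp)"
cat > "$demo" <<'EOF'
## Step 5
Constraint clause citing P146.
## Related
- **P121** driver for the idle-timeout branch
- **P146** driver for the polling-discipline clause
## Other
Nothing relevant here.
EOF

# grep -c prints the number of matching lines within each section.
related_hits="$(section_body "Related" "$demo" | grep -c 'P146')"
step5_hits="$(section_body "Step 5" "$demo" | grep -c 'P146')"
other_hits="$(section_body "Other" "$demo" | grep -c 'P146')"

echo "related=$related_hits step5=$step5_hits other=$other_hits"
rm -f "$demo"
```

Inside a bats test, the same pipeline could be wrapped as `run bash -c '...'` or called from a helper defined in `setup`, with the assertion on `$status` left as it is today.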