npm - @windyroad/itil - Versions diffs - 0.41.0 → 0.42.0 - Mend

@windyroad/itil 0.41.0 → 0.42.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (11)

package/.claude-plugin/plugin.json +14 -1
package/agents/hang-off-check.md +146 -0
package/agents/test/fixtures/proceed-new-genuinely-new.md +43 -0
package/agents/test/fixtures/proceed-new-subtle-sibling.md +47 -0
package/agents/test/fixtures/regression-p347-vs-p346.md +54 -0
package/agents/test/hang-off-check.bats +225 -0
package/hooks/lib/changeset-detect.sh +113 -9
package/hooks/test/itil-changeset-discipline.bats +153 -0
package/package.json +1 -1
package/skills/capture-problem/SKILL.md +80 -3
package/skills/manage-problem/SKILL.md +24 -0

package/.claude-plugin/plugin.json CHANGED

@@ -1,6 +1,19 @@
 {
   "description": "ITIL-aligned IT service management for Claude Code",
   "maturity": {
+    "agents": {
+      "hang-off-check": {
+        "band": "Experimental",
+        "computed_at": "2026-05-31T00:00:00Z",
+        "evidence": {
+          "breaking_change_age_days": null,
+          "closed_tickets_window": 0,
+          "days_shipped": 0,
+          "invocations_30d": null
+        },
+        "schema_version": "2.0"
+      }
+    },
     "band": "Experimental",
     "bootstrapping": true,
     "hooks": {
@@ -484,5 +497,5 @@
     }
   },
   "name": "wr-itil",
-  "version": "0.41.0"
+  "version": "0.42.0"
 }

package/agents/hang-off-check.md ADDED

@@ -0,0 +1,146 @@
+---
+name: hang-off-check
+description: Capture-time inflow-discipline arbiter for problem tickets. Given a
+  new capture's description plus a filtered candidate ticket list, returns a
+  structured verdict — HANG_OFF P<NNN> when the new scope belongs as an
+  Investigation Tasks expansion / Phase N section on an existing parent ticket,
+  or PROCEED_NEW when no candidate absorbs the new scope. Spawned fresh from
+  inside /wr-itil:capture-problem and /wr-itil:manage-problem Step 2 to avoid
+  the calling agent's session-context bias. Read-only. Codified as ADR-032's
+  5th invocation pattern under the P346 amendment.
+tools:
+  - Read
+  - Glob
+  - Grep
+model: inherit
+---
+# @jtbd JTBD-001, JTBD-006, JTBD-101, JTBD-201
+You are the Hang-Off Check arbiter. You decide — for a new problem-ticket capture, given a mechanically pre-filtered set of candidate parent tickets — whether the new scope belongs absorbed into an existing parent (`HANG_OFF: P<NNN>`) or genuinely deserves its own new ticket (`PROCEED_NEW`).
+You are a reviewer, not an editor. You read inputs and emit a structured verdict. You do not modify any files; the calling skill acts on your verdict.
+## Driver
+You exist because session-context bias on the calling main agent is structurally guaranteed: the main agent mid-iter has just been working on related artefacts and pattern-matches existing capture flows, missing hang-off opportunities (the wrongly-captured P347 sibling of P346 on 2026-05-31 is the canonical regression). Your fresh context is the fix — you read only the structured inputs and reason about candidate absorption without the bias.
+Same architectural pattern as `wr-architect:agent` / `wr-jtbd:agent` / `tdd:review-test` / `wr-risk-scorer:pipeline` — codified in ADR-032 as the 5th invocation pattern (P346 amendment, 2026-05-31).
+## Your Inputs
+The calling skill passes you a structured prompt containing two payloads:
+1. **New capture description** — the free-text observation the user / agent wants to capture as a new problem ticket (the leading flags stripped, kebab-title slug derivable but not yet committed).
+2. **Filtered candidate ticket list** — the result of the calling skill's mechanical pre-filter: candidates from `docs/problems/open/` + `docs/problems/verifying/` that share ≥1 signal with the description (ADR-NNN ref, SKILL path, file path, or named feature). The list is capped at 5 candidates per ADR-032's latency-bound contract; wider filtered sets short-circuit to PROCEED_NEW without invoking you.
+For each candidate the calling skill passes:
+- Ticket ID (`P<NNN>`)
+- Title
+- File path (so you can `Read` the full body when needed)
+- The matching signals (which ADR / SKILL / file the pre-filter saw shared with the description)
+You may `Read` candidate ticket files in full to evaluate their scope, Investigation Tasks state, and multi-phase scope sections. You may `Grep` / `Glob` to follow references the description or candidates cite. You SHOULD NOT load unrelated files; your reasoning should be grounded in the explicit input payload + the candidate ticket bodies.
+## How You Decide
+For each candidate, ask: **does the new capture belong inside this candidate as scope expansion (Investigation Tasks bullet, Phase N section, or sibling-finding under the same root cause), or does it stand alone as a distinct problem?**
+A new capture HANGS OFF a candidate when:
+- The candidate is a **master ticket** for a multi-phase fix and the new capture is one phase / one sub-class of that work (P346's three-phase scope is the canonical example — Phase 3 work belongs inside P346 as Phase 3, not as a sibling P347).
+- The candidate's `## Multi-phase scope` / `## Investigation Tasks` / `## Root Cause Analysis` section explicitly names work the new capture is doing (or is a natural extension of work the candidate names as in-scope).
+- The candidate's root cause + the new capture's root cause are the same observable phenomenon, just surfaced at different times or by different signals.
+- The new capture's description IS the candidate's deferred follow-up (the candidate's `## Fix Strategy` or `## Investigation Tasks` flags the work as "deferred to sibling ticket" but the sibling is actually scope expansion on this very ticket).
+A new capture PROCEEDS as new when:
+- The candidate's root cause is genuinely distinct from the new capture's root cause (shared keywords / shared file paths can mislead — a SKILL.md edit in capture-problem can be about three different problems with three different fix loci).
+- The candidate is in Verifying lifecycle and the new capture is post-verifying-close discovery (the candidate is shipping its fix; the new capture is a fresh observation that needs its own intake).
+- The candidate is a **sibling** to what the new capture is about (both are surfaces of a common parent that neither candidate IS) — in this case, recommend `PROCEED_NEW` and let `/wr-itil:review-problems` cluster them later.
+- The new capture would force the candidate's scope to grow past its INVEST shape (single-purpose-anchor; multi-concern dilution).
+- The new capture's `## Description` framing is fundamentally different from the candidate's even if surface signals overlap.
+When in doubt, prefer **PROCEED_NEW** — false-negative on hang-off is cheaper than false-positive (false-positive silently swallows distinct work into the wrong parent; false-negative just defers consolidation to the next `/wr-itil:review-problems` cluster pass). This mirrors `/wr-itil:capture-problem` Step 2's existing "false-positives are cheaper than false-negatives" framing.
+Under `--no-prompt` / AFK propagation, ambiguous-multi-parent cases also collapse to **PROCEED_NEW** (safe-default, no `AskUserQuestion` fallback — ADR-013 Rule 6 fail-safe per ADR-032 amendment).
+## How to Report
+Emit one of two structured verdict shapes. The verdict line is parsed by the calling skill; the rationale block is preserved as the audit trail.
+### When the new capture hangs off an existing parent
+```
+HANG_OFF: P<NNN>
+**Rationale**: <one or two sentences naming the candidate's master-ticket / multi-phase / scope-expansion shape and why this new capture belongs inside it>.
+**Signals matched**: <comma-separated list of the specific signals — e.g. "shared ADR-079 reference", "candidate's Investigation Tasks Phase 3 section names this work", "shared `packages/itil/agents/hang-off-check.md` file path", "candidate's Fix Strategy deferred this exact scope">.
+**Where to absorb**: <one sentence naming where on the candidate ticket the new scope lands — e.g. "amend candidate's Investigation Tasks checklist with [the deliverable]", "expand candidate's `### Phase N — <name>` section with [the new substance]", "append to candidate's Symptoms / Workaround section">.
+```
+### When the new capture proceeds as a new ticket
+```
+PROCEED_NEW
+**Rationale**: <one or two sentences explaining why no candidate absorbs the new scope, even if surface signals overlap>.
+**Per-candidate explanation**: for each candidate the pre-filter surfaced, one short line naming what distinguishes the new capture from that candidate (root cause / lifecycle phase / scope grain / persona / surface).
+```
+## Output Formatting
+When referencing decision IDs (ADR-<NNN>), problem IDs (P<NNN>), RFC IDs (RFC-<NNN>), or JTBD IDs in prose, always include a human-readable hint on first mention. Use `P346 (review-problems backlog-flow-control master ticket)`, not bare `P346`. This matches `wr-architect:agent` / `wr-jtbd:agent` output formatting conventions per their P032 contracts.
+## Scope and Firewalls
+### Maintainer-side only (JTBD-301 firewall)
+You fire on maintainer-side `/wr-itil:capture-problem` and maintainer-internal `/wr-itil:manage-problem` invocations only. You DO NOT fire on:
+- Plugin-user-side intake via `.github/ISSUE_TEMPLATE/problem-report.yml` (plugin-user descriptions do not carry the same authorial intent as maintainer-internal captures — a plugin-user describing their friction in maintainer vocabulary could plausibly trigger a wrong-parent HANG_OFF). Triage during `/wr-itil:manage-problem` ingestion stays user-judgement per JTBD-301.
+- `/wr-itil:manage-problem`'s ingestion-of-plugin-user-reports path (mirrors the lexical-classifier firewall at `packages/itil/skills/capture-problem/SKILL.md` line 116).
+### Cardinality
+One verdict per invocation. The calling skill captures one ticket per invocation; you arbitrate the absorb-or-proceed decision for that single capture against its filtered candidate set.
+### Out of Scope
+- You do NOT decide WSJF priority, effort, or any other field on the new capture or the candidate ticket. The calling skill owns those fields per its own SKILL contract.
+- You do NOT amend any candidate ticket bodies. On HANG_OFF, the calling skill returns control to the orchestrator agent with a halt-and-route directive; the orchestrator amends the named candidate per its standard ticket-edit flow.
+- You do NOT search for candidates the pre-filter did not surface. Your input is the pre-filtered set; if the pre-filter missed a candidate, the failure mode is wrong-PROCEED_NEW (correctable at the next `/wr-itil:review-problems` cluster pass), not silent absorption.
+## Behavioural verification
+The canonical behavioural fixture is the P347-vs-P346 regression — `packages/itil/agents/test/fixtures/regression-p347-vs-p346.md` captures the input shape:
+- New capture description: P347's original description (about "Phase 2 evidence shape expansion" work).
+- Candidate set: contains P346 (the master backlog-flow-control ticket with its Multi-phase scope section explicitly naming Phase 2 as in-scope).
+- Expected verdict: `HANG_OFF: P346` with rationale citing the shared ADR-079 reference + the candidate's Multi-phase scope section explicitly naming Phase 2.
+Two further canonical fixtures live under the same path: `fixtures/proceed-new-genuinely-new.md` (no real candidates → PROCEED_NEW) and `fixtures/proceed-new-subtle-sibling.md` (P070 vs a new report-upstream surface ticket on a different SKILL → PROCEED_NEW with reasoned per-candidate rationale).
+Behavioural execution of these fixtures lands under RFC-012 (promptfoo eval harness — proposed). Until RFC-012 ships, the bats fixtures at `packages/itil/agents/test/hang-off-check.bats` are structural assertions on this agent's prose contract per ADR-052 Surface 2 (with the P176 harness-gap carve-out the architect and JTBD reviewer agents document); they verify the verdict format is documented, the firewall is named, the safe-default behaviour is specified, and the fixture files exist with the expected input shape.
+## Related
+- **ADR-032** (Governance-skill invocation patterns) — the 5th invocation pattern (Foreground fresh-context-subagent-as-decision-arbiter) under the P346 amendment 2026-05-31. This agent IS the worked example.
+- **ADR-013** (Structured user interaction for governance decisions) — Rule 6 fail-safe; you never invoke `AskUserQuestion`; ambiguous-multi-parent collapses to PROCEED_NEW under AFK propagation.
+- **ADR-026** (Agent output grounding) — your rationale MUST cite observable signals (specific ADR refs, specific file paths, specific candidate ticket section names); no qualitative claims.
+- **ADR-049** (Plugin scripts via `bin/` on PATH) — not directly relevant; this agent is loaded via Agent tool, not PATH.
+- **ADR-052** (Behavioural-tests default) — bats fixtures for this agent are structural per Surface 2 carve-out + P176 harness gap; behavioural eval lands under RFC-012.
+- **ADR-075** (promptfoo as agent-prose verdict harness) — future home of the behavioural fixtures.
+- **RFC-012** (promptfoo retrofit, proposed) — will run the canonical behavioural fixtures.
+- **RFC-013** (P346 backlog flow control multi-phase, proposed) — traces P346 Phases 1+2+3; this agent is part of Phase 3's deliverable.
+- **P346** (review-problems backlog-flow-control master ticket — `docs/problems/open/346-...md`) — driver ticket. Phase 3 spec authored in P346's body; codified as ADR-032's 5th pattern.
+- **P347** (closed as duplicate-of-P346) — the wrongly-captured sibling that motivated this agent's existence; the canonical regression fixture.
+- **P176** (agent-side I2 / harness gap) — the structural-bats Surface 2 carve-out precedent.
+- **JTBD-001** (Enforce Governance Without Slowing Down) — the pre-filter latency cap + verdict-acts contract keeps capture under the 60s flow budget.
+- **JTBD-006** (Progress the Backlog While I'm Away) — verdict is deterministic, never blocks on AskUserQuestion; AFK-safe.
+- **JTBD-101** (Extend the Suite with New Plugins) — the fresh-context-subagent-as-decision-arbiter pattern is reusable for future capture-time discipline needs.
+- **JTBD-201** (Restore Service Fast with an Audit Trail) — HANG_OFF rationale is recorded on the absorbing ticket's Investigation Tasks bullet; PROCEED_NEW rationale lands on the captured ticket's `## Related` section so the next reviewer sees what was considered.
+- **`/wr-itil:capture-problem`** (`packages/itil/skills/capture-problem/SKILL.md` Step 2) — primary dispatch site.
+- **`/wr-itil:manage-problem`** (`packages/itil/skills/manage-problem/SKILL.md` Step 2) — secondary dispatch site (maintainer-internal new-problem path only; plugin-user-report ingestion path skips per JTBD-301 firewall).
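
The agent file says the verdict line is parsed by the calling skill, but this diff does not show that parser. As a minimal sketch only, assuming the verdict is the first line of the agent's reply and that a malformed reply should fall back to the documented safe default (`PROCEED_NEW`), a POSIX shell parser might look like the following. `parse_verdict` is a hypothetical name, not something shipped in the package.

```shell
# Hypothetical helper (not part of @windyroad/itil).
# Reads an agent reply on stdin and prints a normalised verdict:
#   "HANG_OFF P<digits>"  or  "PROCEED_NEW"
parse_verdict() {
  # Only the first line is treated as the verdict; the rationale block
  # that follows is left untouched for the audit trail.
  IFS= read -r first_line || :
  case "$first_line" in
    "HANG_OFF: P"*)
      id=${first_line#"HANG_OFF: P"}
      case "$id" in
        # Assumption: an empty or non-numeric ID is treated as malformed
        # and collapses to the safe default.
        ''|*[!0-9]*) printf 'PROCEED_NEW\n' ;;
        *) printf 'HANG_OFF P%s\n' "$id" ;;
      esac
      ;;
    # Assumption: anything else, including an unparseable reply,
    # is handled as PROCEED_NEW.
    *) printf 'PROCEED_NEW\n' ;;
  esac
}
```

Piping the regression fixture's expected verdict through `parse_verdict` prints `HANG_OFF P346`. The match is deliberately strict: trailing whitespace on the verdict line is treated as malformed rather than guessed at.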

package/agents/test/fixtures/proceed-new-genuinely-new.md ADDED

@@ -0,0 +1,43 @@
+# Behavioural fixture 2: genuinely-new capture (PROCEED_NEW)
+Validates that the agent does NOT spuriously fold a genuinely-new capture into a candidate just because the mechanical pre-filter surfaced one.
+## Input 1: New capture description
+```
+The `claude plugin marketplace update` command silently caches plugin
+metadata for 24 hours without surfacing the cache state in any UI. Adopters
+attempting to refresh after a release wait 24h not knowing the cache is
+serving stale data. Need a manual cache-bust flag or a TTL surface in the
+marketplace command's output. Affects every adopter waiting on a release.
+Verifiable by: invoke `claude plugin marketplace update` immediately after a
+release; observe no version-bump surface; wait 24h; observe version bump.
+```
+## Input 2: Filtered candidate set
+| Candidate | Title | Path | Matching signals |
+|-----------|-------|------|------------------|
+| P106 | `claude plugin install` is a silent no-op when already installed at any version | `docs/problems/open/106-...md` | shared `claude plugin` command surface; both touch the install/marketplace cache layer |
+## Expected verdict
+```
+PROCEED_NEW
+**Rationale**: P106's root cause is the `claude plugin install` command being
+a no-op when already installed at any version (no version-equality check).
+The new capture's root cause is a different surface: `claude plugin
+marketplace update`'s opaque 24-hour metadata cache. The two share the
+upstream "claude plugin marketplace cache layer" but operate at different
+commands with different observable symptoms and different fix loci. Folding
+the new capture into P106 would dilute P106's single-purpose anchor.
+**Per-candidate explanation**:
+- P106 (claude plugin install silent no-op) — different command (`install` vs
+  `marketplace update`); different observable (no version bump on already-
+  installed plugin vs no version bump because metadata cache is stale);
+  different fix locus (per-project install logic vs marketplace metadata
+  TTL/surfacing). Surface signals overlap (both touch the marketplace cache
+  layer) but the captured friction is genuinely distinct.
+```

package/agents/test/fixtures/proceed-new-subtle-sibling.md ADDED

@@ -0,0 +1,47 @@
+# Behavioural fixture 3: subtle sibling-vs-parent (PROCEED_NEW with reasoned rationale)
+Validates that the agent reasons cleanly about sibling-vs-parent shape — two tickets sharing surface signals where neither is the other's parent.
+## Input 1: New capture description
+```
+`/wr-itil:report-upstream` SKILL.md surface — the report-upstream skill's
+generated GitHub issue body sometimes carries our internal P<NNN> ticket IDs
+verbatim instead of stripping them. Adopters reporting upstream see opaque
+references they can't resolve. Should redact internal IDs before posting or
+include a "for our records" footer separating internal vs upstream
+references. Affects every outbound issue from /wr-itil:report-upstream.
+```
+## Input 2: Filtered candidate set
+| Candidate | Title | Path | Matching signals |
+|-----------|-------|------|------------------|
+| P070 | `/wr-itil:report-upstream` doesn't deduplicate against existing upstream issues | `docs/problems/verifying/070-...md` | shared `/wr-itil:report-upstream` SKILL reference; both about outbound surfaces on the same SKILL |
+## Expected verdict
+```
+PROCEED_NEW
+**Rationale**: P070's root cause is the report-upstream SKILL not checking
+the upstream repo for an existing issue before posting (deduplication gap).
+The new capture's root cause is the report-upstream SKILL leaking
+maintainer-internal P<NNN> references in the generated issue body (privacy /
+adopter-confusion gap). The two are siblings — both surface on
+`/wr-itil:report-upstream`'s outbound generation flow — but neither is the
+other's parent. The shared SKILL surface is coincidental; the fix loci are
+distinct (upstream-repo-state lookup vs body-text scrubbing). P070's scope
+does not name internal-ID-leakage as deferred follow-up. Recommend
+PROCEED_NEW; let `/wr-itil:review-problems` cluster them as sibling-class at
+the next cluster pass if a common parent emerges.
+**Per-candidate explanation**:
+- P070 (report-upstream deduplication gap) — same SKILL surface, but
+  different observable (duplicate upstream issues vs leaked internal IDs in
+  the body), different fix locus (gh API state lookup vs string scrubbing on
+  the body template), different lifecycle stage (P070 is Verifying with a
+  shipping fix; the new capture is fresh discovery on a different code
+  surface in the same file). Folding would dilute P070's single-purpose
+  anchor and force its Verifying transition to wait on unrelated work.
+```

package/agents/test/fixtures/regression-p347-vs-p346.md ADDED

@@ -0,0 +1,54 @@
+# Behavioural fixture 1 (canonical regression): P347-vs-P346
+This is the canonical regression case for the `wr-itil:hang-off-check` agent. It captures the 2026-05-31 P347 wrongly-captured-sibling-of-P346 incident. If the agent receives this fixture's inputs and returns anything other than `HANG_OFF: P346`, the SKILL is insufficient and the regression has re-opened.
+Behavioural execution lands under RFC-012 (promptfoo eval harness, proposed). Until RFC-012 ships, this fixture is the documentation of the expected behaviour; the bats fixtures at `../hang-off-check.bats` are structural assertions on the agent's prose contract per ADR-052 Surface 2 carve-out.
+## Input 1: New capture description
+```
+P346 + ADR-079 Phase 2 — empirical foreground relevance-scan today (5 batches,
+14 closes) revealed 4 evidence shapes Phase 1 doesn't implement, plus the
+1 shape Phase 1 does implement had the highest false-positive rate. The four
+shapes: ADR-shipped-with-`human-oversight: confirmed`, named-skill-or-feature-
+exists, self-marker-in-body (P289-class), driver-child-ticket-closed (P155 →
+P014). Phase 1 false-positive fixes needed for state-suffix, sibling-file, and
+rename detection (via `git log --follow`). Behavioural bats extension from 18
+to 33 fixtures. `evaluate-relevance.sh` extension. Update
+`/wr-itil:review-problems` SKILL.md Step 4.6 + `/wr-itil:manage-problem`
+lifecycle table.
+```
+## Input 2: Filtered candidate set
+| Candidate | Title | Path | Matching signals |
+|-----------|-------|------|------------------|
+| P346 | `/wr-itil:review-problems` has no path to close tickets that are no longer relevant (evidence-based, NOT age-based) — structural outflow gap drives monotonic backlog growth | `docs/problems/open/346-...md` | shared ADR-079 ref; shared `packages/itil/scripts/evaluate-relevance.sh` path; shared `/wr-itil:review-problems` SKILL ref; shared `/wr-itil:manage-problem` SKILL ref; candidate's `## Multi-phase scope` section explicitly names Phase 2 as in-scope |
+## Expected verdict
+```
+HANG_OFF: P346
+**Rationale**: P346 is the master ticket for the framework's backlog-flow-control
+mechanisms with an explicit Multi-phase scope section. The new capture's
+description IS Phase 2 of P346 (additional evidence shapes + Phase 1
+false-positive fixes for the `evaluate-relevance.sh` script that P346 Phase 1
+introduced). P346's body already names Phase 2 work as in-scope.
+**Signals matched**: shared ADR-079 reference; shared
+`packages/itil/scripts/evaluate-relevance.sh` file path; shared
+`/wr-itil:review-problems` SKILL reference; shared `/wr-itil:manage-problem`
+SKILL reference; candidate's `## Multi-phase scope` section explicitly names
+Phase 2 as in-scope.
+**Where to absorb**: amend P346's Investigation Tasks checklist with the 4
+new evidence shapes + Phase 1 false-positive fix items; add a `### Phase 2 —
+evidence shape expansion + Phase 1 false-positive fixes` section under the
+existing Multi-phase scope section; the shipped commits' attribution
+references this absorption.
+```
+## Why this fixture is canonical
+The wrongly-captured P347 sibling of P346 on 2026-05-31 motivated the entire Phase 3 deliverable (this agent's existence). If a future change to the agent's prose contract regresses the verdict on this fixture, the SKILL no longer fulfils its driver. This is the binary tripwire.
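
The fixtures share one input shape: a markdown table of pre-filtered candidates. The agent file caps that list at 5 and says wider sets short-circuit to `PROCEED_NEW` without invoking the agent. A rough sketch of that gate in POSIX shell follows. The function names are hypothetical, and treating an empty candidate set as `PROCEED_NEW` is an assumption rather than something this diff confirms.

```shell
# Hypothetical helpers (not part of @windyroad/itil).

# Print the ticket IDs found in the first column of a markdown table
# read from stdin. Header and separator rows do not match and are skipped.
candidate_ids() {
  sed -n 's/^| *\(P[0-9][0-9]*\) *|.*/\1/p'
}

# Decide what to do for a given candidate count.
# Prints "INVOKE" for 1..5 candidates, "PROCEED_NEW" otherwise.
dispatch_for_count() {
  count=$1
  if [ "$count" -ge 1 ] && [ "$count" -le 5 ]; then
    printf 'INVOKE\n'
  else
    # More than 5: short-circuit per the documented latency cap.
    # Zero: assumed here to need no arbitration at all.
    printf 'PROCEED_NEW\n'
  fi
}
```

With the single-row table from this fixture, `candidate_ids` yields `P346` and a count of 1, so the sketch would hand the decision to the arbiter.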

package/agents/test/hang-off-check.bats ADDED

@@ -0,0 +1,225 @@
+#!/usr/bin/env bats
+# Doc-lint guard: wr-itil:hang-off-check agent contract — the agent MUST
+# carry the verdict format (HANG_OFF: P<NNN> | PROCEED_NEW), the
+# fresh-context subagent rationale, the JTBD-301 maintainer-side firewall,
+# the AFK safe-default (PROCEED_NEW on ambiguous), the rationale-citation
+# requirement, and the three canonical fixture files. Closes P346 Phase 3
+# deliverable — the SKILL's behavioural intent is documented and
+# verifiable.
+#
+# Structural assertion — ADR-052 Surface 2 (structural-justified) +
+# P176 harness gap. Behavioural execution of the three canonical fixtures
+# lands under RFC-012 (promptfoo eval harness, proposed). Upgrade these
+# to behavioural fixtures when RFC-012 ships.
+#
+# Cross-reference:
+#   P346 (review-problems backlog-flow-control master ticket; Phase 3
+#         deliverable this agent fulfils)
+#   P347 (closed as duplicate-of-P346; canonical regression case driving
+#         fixture 1)
+#   P176 (agent-side I2 / harness gap — Surface 2 carve-out precedent)
+#   ADR-032 (5th invocation pattern — fresh-context-subagent-as-decision-
+#            arbiter; P346 amendment 2026-05-31 codifies this agent's shape)
+#   ADR-052 (behavioural-tests default; Surface 2 carve-out)
+#   ADR-075 (promptfoo as agent-prose verdict harness — future home)
+#   RFC-012 (promptfoo retrofit — behavioural eval harness)
+#   RFC-013 (P346 multi-phase trace per ADR-071)
+#   @jtbd JTBD-001 (enforce governance without slowing down)
+#   @jtbd JTBD-006 (progress backlog while I'm away — AFK safe-default)
+#   @jtbd JTBD-101 (extend suite with new plugins — pattern reuse)
+#   @jtbd JTBD-201 (restore service fast with an audit trail — rationale)
+setup() {
+  AGENT_DIR="$(cd "$(dirname "$BATS_TEST_FILENAME")/.." && pwd)"
+  AGENT_FILE="${AGENT_DIR}/hang-off-check.md"
+  FIXTURES_DIR="${AGENT_DIR}/test/fixtures"
+}
+# ----- Contract surface: verdict format + structure -----
+@test "agent.md exists at packages/itil/agents/hang-off-check.md" {
+  [ -f "$AGENT_FILE" ]
+}
+@test "agent.md frontmatter declares name: hang-off-check" {
+  run grep -nE "^name: hang-off-check$" "$AGENT_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "agent.md frontmatter limits tools to read-only set (Read, Glob, Grep)" {
+  # No Edit, no Write, no Bash — read-only reviewer per ADR-032 5th pattern
+  run grep -nE "tools:" "$AGENT_FILE"
+  [ "$status" -eq 0 ]
+  run grep -nE "^  - (Read|Glob|Grep)$" "$AGENT_FILE"
+  [ "$status" -eq 0 ]
+  # Forbid Edit / Write / Bash in the tools list
+  ! grep -nE "^  - (Edit|Write|Bash|MultiEdit|NotebookEdit)$" "$AGENT_FILE"
+}
+@test "agent.md declares HANG_OFF: P<NNN> verdict shape" {
+  run grep -nE 'HANG_OFF:[[:space:]]*P<NNN>' "$AGENT_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "agent.md declares PROCEED_NEW verdict shape" {
+  run grep -nE "PROCEED_NEW" "$AGENT_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "agent.md requires a Rationale section on every verdict" {
+  run grep -nE "\*\*Rationale\*\*" "$AGENT_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "agent.md requires Signals matched citation on HANG_OFF (rationale-grounding per ADR-026)" {
+  run grep -nE "\*\*Signals matched\*\*" "$AGENT_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "agent.md requires Where to absorb directive on HANG_OFF (calling-skill action contract)" {
+  run grep -nE "\*\*Where to absorb\*\*" "$AGENT_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "agent.md requires Per-candidate explanation on PROCEED_NEW" {
+  run grep -nE "\*\*Per-candidate explanation\*\*" "$AGENT_FILE"
+  [ "$status" -eq 0 ]
+}
+# ----- Driver / context-isolation rationale -----
+@test "agent.md cites the fresh-context / session-context-bias driver" {
+  run grep -niE "(session-context bias|fresh context|context isolation)" "$AGENT_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "agent.md cites the P347-vs-P346 canonical regression in the driver section" {
+  run grep -nE "P347" "$AGENT_FILE"
+  [ "$status" -eq 0 ]
+  run grep -nE "P346" "$AGENT_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "agent.md cites ADR-032 5th-pattern codification (P346 amendment)" {
+  run grep -nE "ADR-032" "$AGENT_FILE"
+  [ "$status" -eq 0 ]
+  run grep -niE "5th (invocation )?pattern" "$AGENT_FILE"
+  [ "$status" -eq 0 ]
+}
+# ----- Decision rule -----
+@test "agent.md names master-ticket / multi-phase as a HANG_OFF signal" {
+  run grep -niE "(master ticket|multi-phase|scope expansion)" "$AGENT_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "agent.md names false-negative-cheaper-than-false-positive safe-default" {
+  run grep -niE "false[-]positive.*cheap|cheap.*false[-]positive" "$AGENT_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "agent.md cites Rule 6 fail-safe / AFK safe-default (ambiguous → PROCEED_NEW)" {
+  run grep -niE "(ambiguous.*PROCEED_NEW|--no-prompt|AFK propagation)" "$AGENT_FILE"
+  [ "$status" -eq 0 ]
+  run grep -nE "ADR-013" "$AGENT_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "agent.md explicitly forbids AskUserQuestion invocation (Rule 6)" {
+  run grep -niE "(never|do not|MUST NOT).*(AskUserQuestion|invoke.*AskUserQuestion)" "$AGENT_FILE"
+  [ "$status" -eq 0 ]
+}
+# ----- Scope / firewalls -----
+@test "agent.md names the JTBD-301 maintainer-side firewall" {
+  run grep -nE "JTBD-301" "$AGENT_FILE"
+  [ "$status" -eq 0 ]
+  run grep -niE "maintainer-side|maintainer-internal" "$AGENT_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "agent.md excludes plugin-user-side intake from the dispatch (firewall)" {
+  run grep -niE "plugin-user-side|problem-report\.yml" "$AGENT_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "agent.md excludes manage-problem ingestion-of-plugin-user-reports path" {
+  run grep -niE "ingestion[- ]of[- ]plugin[- ]user[- ]reports" "$AGENT_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "agent.md scopes cardinality to one verdict per invocation" {
+  run grep -niE "one verdict per invocation|cardinality" "$AGENT_FILE"
+  [ "$status" -eq 0 ]
+}
+# ----- Output formatting (matches architect / jtbd output formatting convention) -----
+@test "agent.md carries Output Formatting section requiring human-readable IDs" {
+  run grep -nE "## Output Formatting" "$AGENT_FILE"
+  [ "$status" -eq 0 ]
+}
+# ----- JTBD annotation (per JTBD review nit 5) -----
+@test "agent.md carries @jtbd annotation citing JTBD-001/006/101/201" {
+  run grep -nE "^# @jtbd .*JTBD-001.*JTBD-006.*JTBD-101.*JTBD-201" "$AGENT_FILE"
+  [ "$status" -eq 0 ]
+}
+# ----- Canonical behavioural fixtures (RFC-012 future-home) -----
+@test "fixture 1 (canonical P347-vs-P346 regression) exists" {
+  [ -f "${FIXTURES_DIR}/regression-p347-vs-p346.md" ]
+}
+@test "fixture 1 names HANG_OFF: P346 as the expected verdict" {
+  run grep -nE "HANG_OFF: P346" "${FIXTURES_DIR}/regression-p347-vs-p346.md"
+  [ "$status" -eq 0 ]
+}
+@test "fixture 1 cites P347 as the wrongly-captured-sibling motivator" {
+  run grep -nE "P347" "${FIXTURES_DIR}/regression-p347-vs-p346.md"
+  [ "$status" -eq 0 ]
+}
+@test "fixture 2 (genuinely-new) exists" {
+  [ -f "${FIXTURES_DIR}/proceed-new-genuinely-new.md" ]
+}
+@test "fixture 2 names PROCEED_NEW as the expected verdict" {
+  run grep -nE "PROCEED_NEW" "${FIXTURES_DIR}/proceed-new-genuinely-new.md"
+  [ "$status" -eq 0 ]
+}
+@test "fixture 3 (subtle sibling-vs-parent) exists" {
+  [ -f "${FIXTURES_DIR}/proceed-new-subtle-sibling.md" ]
+}
+@test "fixture 3 names PROCEED_NEW with reasoned per-candidate rationale" {
+  run grep -nE "PROCEED_NEW" "${FIXTURES_DIR}/proceed-new-subtle-sibling.md"
+  [ "$status" -eq 0 ]
+  run grep -niE "Per-candidate explanation" "${FIXTURES_DIR}/proceed-new-subtle-sibling.md"
+  [ "$status" -eq 0 ]
+}
+# ----- RFC-012 forward-reference -----
+@test "agent.md cross-references RFC-012 as future behavioural-eval home" {
+  run grep -nE "RFC-012" "$AGENT_FILE"
+  [ "$status" -eq 0 ]
+}
+# ----- Cross-references to dispatch sites -----
+@test "agent.md cross-references /wr-itil:capture-problem as primary dispatch site" {
+  run grep -nE "/wr-itil:capture-problem" "$AGENT_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "agent.md cross-references /wr-itil:manage-problem as secondary dispatch site" {
+  run grep -nE "/wr-itil:manage-problem" "$AGENT_FILE"
+  [ "$status" -eq 0 ]
+}
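
Most assertions in this file are positive greps; the forbidden-tools check is the one negative assertion. Because the shell exempts negated commands (`! cmd`) from `set -e`, a negative check is easiest to reason about when it is an explicit status test. A minimal sketch, assuming a hypothetical helper named `assert_no_match` that is not part of the package:

```shell
# Hypothetical helper (not part of @windyroad/itil).
# Returns 0 only when FILE exists and no line matches the extended regex.
assert_no_match() {
  pattern=$1
  file=$2
  # A missing file is reported as an error instead of passing silently.
  if [ ! -f "$file" ]; then
    printf 'no such file: %s\n' "$file" >&2
    return 2
  fi
  if grep -qE -- "$pattern" "$file"; then
    printf 'unexpected match for %s in %s\n' "$pattern" "$file" >&2
    return 1
  fi
  return 0
}
```

Inside a test body, a call such as `assert_no_match '^  - (Edit|Write|Bash)$' "$AGENT_FILE"` would fail the test wherever it appears, since it is a plain command with a non-zero status.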

package/hooks/lib/changeset-detect.sh CHANGED

@@ -31,8 +31,29 @@
 #           - otherwise: publishable source — record the slug.
 #       * any other path: ignored (non-publishable surface — `.github/`,
 #         root config, top-level `docs/`, etc.).
-#   - If any path is publishable source AND no valid changeset is
-#     staged, return 1 + echo the slug.
+#   - If any path is publishable source:
+#       * **Check 2a (Phase 1)**: a `.changeset/*.md` (or held-window
+#         `docs/changesets-holding/*.md` per P177) staged → allow.
+#       * **Check 2b (Phase 2)**: an in-scope `.changeset/*.md` (or
+#         held-window entry) targeting the plugin via YAML frontmatter
+#         `"@windyroad/<slug>": <any-bump>` → allow. Scope =
+#         in-unpushed-range additions (`<base>..HEAD`) + untracked
+#         working-tree files + modified-not-staged working-tree files.
+#         Base = `@{u}` (current branch upstream) with fallback to
+#         `origin/main`. Once consumed onto origin (drained by
+#         changesets-action), the changeset is gone and a fresh one
+#         is required.
+#       * Neither check satisfied → return 1 + echo the slug.
+#
+# Phase 2 rationale (P141 2026-05-31): AFK orchestrator iters that
+# ship a multi-commit slice for one plugin (e.g. P346 Phase 3 across
+# 4 commits, 2 of which touched `packages/itil/`) should not author N
+# redundant changesets for one logical bump. changesets-action
+# collapses bump-class at version-package time, so per-commit
+# changesets render N CHANGELOG bullets for one release entry. Phase
+# 2's Check 2b lets the author write the changeset on the FIRST
+# commit; subsequent same-plugin commits naturally allow because the
+# changeset is already in the unpushed-range scope.
 #
 # Bypass:
 #   - `BYPASS_CHANGESET_GATE=1` env var → return 0 (allow). For
@@ -77,14 +98,85 @@
 #              shape — per-invocation deterministic, no markers).
 #   P141     — this helper.
+# P141 Phase 2 helper — does any `.changeset/*.md` (or held entry under
+# `docs/changesets-holding/*.md`) ALREADY in scope target the plugin
+# slug via its YAML frontmatter `"@windyroad/<slug>": <bump>` line?
+#
+# Scope = files reachable from HEAD but not from `origin/<base>`,
+# plus untracked working-tree changesets, plus modified-not-staged
+# changesets. Once a changeset is on `origin/<base>` (drained by
+# changesets-action at release time), it no longer counts — Check 2b
+# requires a fresh changeset for the next slice.
+#
+# Per-plugin granularity (NOT per-bump-class — changesets-action
+# collapses bump-class at version-package time when multiple
+# changesets for the same plugin merge; the published bump-class is
+# the maximum across the merged set).
+#
+# Base resolution: prefer the current branch's upstream (`@{u}`),
+# fall back to `origin/main`. If neither resolves (e.g. fresh
+# repo with no remotes), Check 2b returns 1 (no in-range scope to
+# inspect) — Phase 1 strict-deny behaviour is preserved.
+#
+# Returns: 0 (an in-scope changeset covers the plugin → allow)
+#          1 (no covering changeset found → caller falls through)
+_changeset_in_scope_covers_plugin() {
+  local slug="$1"
+  local base
+  local candidates path
+  base=$(git rev-parse --abbrev-ref --symbolic-full-name '@{u}' 2>/dev/null) \
+    || base="origin/main"
+  git rev-parse --verify --quiet "$base" >/dev/null 2>&1 || return 1
+  # Enumerate candidate changeset files:
+  #   1. In-range additions: changesets added in unpushed commits
+  #      (`<base>..HEAD`). A changeset later deleted in the same
+  #      range is filtered by the on-disk existence check below.
+  #   2. Untracked: changesets in the working tree not yet tracked
+  #      by git (author wrote but did not stage).
+  #   3. Modified-not-staged: changesets edited since their last
+  #      commit but not yet re-staged.
+  # Excludes `*/README.md` meta-docs (mirrors the staged-path branch).
+  candidates=$(
+    {
+      git log --diff-filter=A --name-only --pretty=format: "${base}..HEAD" \
+        -- '.changeset/*.md' 'docs/changesets-holding/*.md' 2>/dev/null
+      git ls-files --others --exclude-standard \
+        -- '.changeset/*.md' 'docs/changesets-holding/*.md' 2>/dev/null
+      git diff --name-only \
+        -- '.changeset/*.md' 'docs/changesets-holding/*.md' 2>/dev/null
+    } | grep -v '/README\.md$' | sort -u
+  )
+  [ -n "$candidates" ] || return 1
+  while IFS= read -r path; do
+    [ -n "$path" ] || continue
+    [ -f "$path" ] || continue
+    # Extract YAML frontmatter (lines between the first two `---`
+    # markers) and match the canonical `"@windyroad/<slug>":` line.
+    # awk scoping prevents false positives from prose body mentions.
+    if awk '/^---[[:space:]]*$/ { c++; if (c == 1) next; if (c == 2) exit } c == 1 { print }' "$path" 2>/dev/null \
+        | grep -qE "^\"@windyroad/${slug}\":[[:space:]]"; then
+      return 0
+    fi
+  done <<EOF
+$candidates
+EOF
+  return 1
+}
 # Detect whether the current staged set requires a changeset that is
-# not staged.
+# not satisfied by either staged Check 2a or in-scope Check 2b.
 #
 # Echoes the offending plugin slug on stdout when detected.
 #
 # Returns:
-#   0 — no change required, or BYPASS env set, or fail-open (allow)
-#   1 — change required + no changeset staged (caller should deny)
+#   0 — no change required, BYPASS env set, fail-open, or an in-scope
+#       changeset covers the plugin (Phase 2 Check 2b)
+#   1 — change required + no covering changeset (caller should deny)
 detect_changeset_required() {
   # Bypass via env var — single most-common legitimate escape.
   if [ "${BYPASS_CHANGESET_GATE:-}" = "1" ]; then
@@ -170,10 +262,22 @@ detect_changeset_required() {
 $staged
 EOF
-  if [ -n "$plugin_source_slug" ] && [ "$has_changeset" -eq 0 ]; then
-    printf '%s\n' "$plugin_source_slug"
-    return 1
+  # No publishable plugin source staged → allow.
+  [ -n "$plugin_source_slug" ] || return 0
+  # Check 2a — staged changeset satisfies (Phase 1 behaviour).
+  if [ "$has_changeset" -eq 1 ]; then
+    return 0
+  fi
+  # Check 2b (P141 Phase 2) — in-scope changeset targeting the plugin
+  # satisfies. Scope = unpushed-range commits + untracked + modified-
+  # not-staged working-tree files. Once consumed onto origin, the
+  # changeset is gone and a fresh one is required.
+  if _changeset_in_scope_covers_plugin "$plugin_source_slug"; then
+    return 0
   fi
-  return 0
+  printf '%s\n' "$plugin_source_slug"
+  return 1
 }
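
The frontmatter-scoped match used by Check 2b can be tried in isolation. The sketch below reuses the same `awk` and `grep` pipeline shown in the diff, wrapped in a hypothetical function `frontmatter_targets_plugin` (our name, for illustration only), and runs it against two throwaway files: one that declares `@windyroad/itil` in its frontmatter and one that only mentions it in the prose body.

```shell
# Hypothetical wrapper for illustration; the awk + grep pipeline is the one in the diff.
frontmatter_targets_plugin() {
  slug="$1"
  path="$2"
  # Print only the lines between the first two `---` markers, then look for the slug line.
  awk '/^---[[:space:]]*$/ { c++; if (c == 1) next; if (c == 2) exit } c == 1 { print }' "$path" \
    | grep -qE "^\"@windyroad/${slug}\":[[:space:]]"
}

demo_dir=$(mktemp -d)
printf -- '---\n"@windyroad/itil": patch\n---\nfix the thing\n' > "$demo_dir/covers.md"
printf -- '---\n"@windyroad/voice-tone": patch\n---\n"@windyroad/itil": mentioned in prose only\n' > "$demo_dir/prose-only.md"

if frontmatter_targets_plugin itil "$demo_dir/covers.md"; then
  echo "covers.md: match"
fi
if ! frontmatter_targets_plugin itil "$demo_dir/prose-only.md"; then
  echo "prose-only.md: no match"
fi
```

Because `awk` exits at the second `---`, the body line in `prose-only.md` never reaches `grep`, which is the false-positive case the comment in the diff describes.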

package/hooks/test/itil-changeset-discipline.bats CHANGED

@@ -421,3 +421,156 @@ run_bash_hook() {
   [ "$status" -eq 0 ]
   [[ "$output" == *"\"permissionDecision\": \"deny\""* ]]
 }
+# --- P141 Phase 2: in-scope-changeset coverage for multi-commit slices ---
+#
+# Phase 2 amendment (2026-05-31) widens the allow path: a `.changeset/*.md`
+# already in the unpushed slice scope (committed in a prior unpushed commit,
+# untracked in the working tree, or modified-not-staged) that targets the
+# plugin via its YAML frontmatter `"@windyroad/<plugin>": <any-bump>` line
+# also satisfies the gate. Phase 1 strict-deny behaviour preserved for the
+# no-coverage case.
+#
+# Scope boundary: `<base>..HEAD` where base = `@{u}` upstream tracking branch
+# with fallback to `origin/main`. Once a changeset is on `origin/<base>`
+# (drained by changesets-action), it no longer counts — Phase 2 boundary
+# fixture below proves this.
+#
+# Per-plugin granularity: an `@windyroad/itil` changeset does NOT cover a
+# `packages/voice-tone/` commit — wrong-plugin negative fixture below.
+# Helper: mark the current HEAD as `origin/main` so subsequent commits
+# fall into the unpushed-range scope `origin/main..HEAD`. The bats setup
+# creates a local repo with no remote; this synthesises an origin/main
+# ref via `git update-ref` for behavioural testing.
+mark_origin_at_head() {
+  git update-ref refs/remotes/origin/main HEAD
+}
+@test "P141 Phase 2 allow: in-range committed changeset for plugin covers subsequent same-plugin commit" {
+  mark_origin_at_head
+  # Commit 1: ship the changeset + initial source together (Phase 1 case).
+  echo "skill body 1" > packages/itil/skills/foo/SKILL.md
+  printf -- '---\n"@windyroad/itil": patch\n---\nfix the thing\n' > .changeset/wr-itil-p347.md
+  git add packages/itil/skills/foo/SKILL.md .changeset/wr-itil-p347.md
+  git -c commit.gpgsign=false commit --quiet -m "feat 1"
+  # Commit 2: stage more itil source — no new changeset, but the in-range
+  # changeset from commit 1 covers @windyroad/itil. Gate must allow.
+  echo "more skill" > packages/itil/skills/foo/SKILL.md
+  git add packages/itil/skills/foo/SKILL.md
+  run run_bash_hook "git commit -m 'feat 2'"
+  [ "$status" -eq 0 ]
+  [[ "$output" != *"\"permissionDecision\": \"deny\""* ]]
+  # Silent pass per ADR-045 Pattern 1.
+  [ "${#output}" -eq 0 ]
+}
+@test "P141 Phase 2 deny boundary: changeset consumed onto origin no longer counts; fresh required" {
+  # Commit 1: ship the changeset onto the base.
+  printf -- '---\n"@windyroad/itil": patch\n---\nfix the thing\n' > .changeset/wr-itil-p347.md
+  git add .changeset/wr-itil-p347.md
+  git -c commit.gpgsign=false commit --quiet -m "changeset on base"
+  # Mark as drained-to-origin — changesets-action consumed it at release.
+  mark_origin_at_head
+  # Remove the file as changesets-action would on consumption.
+  git rm --quiet .changeset/wr-itil-p347.md
+  git -c commit.gpgsign=false commit --quiet -m "changeset consumed on origin"
+  mark_origin_at_head
+  # Now stage fresh itil source — no in-range changeset, no staged changeset.
+  echo "skill body" > packages/itil/skills/foo/SKILL.md
+  git add packages/itil/skills/foo/SKILL.md
+  run run_bash_hook "git commit -m 'feat'"
+  [ "$status" -eq 0 ]
+  [[ "$output" == *"\"permissionDecision\": \"deny\""* ]]
+  [[ "$output" == *"P141"* ]]
+}
+@test "P141 Phase 2 deny: in-range changeset for DIFFERENT plugin does not cover this plugin's source (wrong-plugin)" {
+  mark_origin_at_head
+  # Commit 1: ship an @windyroad/itil changeset.
+  echo "itil source" > packages/itil/skills/foo/SKILL.md
+  printf -- '---\n"@windyroad/itil": patch\n---\nfix itil\n' > .changeset/wr-itil-p347.md
+  git add packages/itil/skills/foo/SKILL.md .changeset/wr-itil-p347.md
+  git -c commit.gpgsign=false commit --quiet -m "feat itil"
+  # Commit 2: stage voice-tone source. The in-range itil changeset must
+  # NOT satisfy the gate for a different plugin (per-plugin granularity).
+  mkdir -p packages/voice-tone/src
+  echo "voice source" > packages/voice-tone/src/x.ts
+  git add packages/voice-tone/src/x.ts
+  run run_bash_hook "git commit -m 'feat voice'"
+  [ "$status" -eq 0 ]
+  [[ "$output" == *"\"permissionDecision\": \"deny\""* ]]
+  # Deny message names the offending plugin slug — voice-tone, not itil.
+  [[ "$output" == *"voice-tone"* ]]
+}
+@test "P141 Phase 2 allow: untracked .changeset/*.md targeting plugin covers staged source" {
+  mark_origin_at_head
+  # Author the changeset to disk but DO NOT stage. Gate must still
+  # recognise it via `git ls-files --others --exclude-standard`.
+  printf -- '---\n"@windyroad/itil": minor\n---\nfeature\n' > .changeset/wr-itil-p347.md
+  echo "skill body" > packages/itil/skills/foo/SKILL.md
+  git add packages/itil/skills/foo/SKILL.md
+  run run_bash_hook "git commit -m 'feat'"
+  [ "$status" -eq 0 ]
+  [[ "$output" != *"\"permissionDecision\": \"deny\""* ]]
+  [ "${#output}" -eq 0 ]
+}
+@test "P141 Phase 2 allow: in-range changeset that was modified-not-staged still covers" {
+  mark_origin_at_head
+  # Commit 1: ship the changeset.
+  printf -- '---\n"@windyroad/itil": patch\n---\noriginal\n' > .changeset/wr-itil-p347.md
+  git add .changeset/wr-itil-p347.md
+  git -c commit.gpgsign=false commit --quiet -m "changeset"
+  # Modify the prose body (frontmatter preserved); do NOT stage the edit.
+  printf -- '---\n"@windyroad/itil": patch\n---\nedited prose\n' > .changeset/wr-itil-p347.md
+  # Stage source.
+  echo "skill body" > packages/itil/skills/foo/SKILL.md
+  git add packages/itil/skills/foo/SKILL.md
+  run run_bash_hook "git commit -m 'feat'"
+  [ "$status" -eq 0 ]
+  [[ "$output" != *"\"permissionDecision\": \"deny\""* ]]
+}
+@test "P141 Phase 2 deny: in-range changeset for plugin exists but its frontmatter targets a different plugin slug" {
+  mark_origin_at_head
+  # A changeset committed in-range whose frontmatter declares ONLY
+  # @windyroad/voice-tone, not @windyroad/itil. Staged source is itil.
+  # Check 2b must NOT match (frontmatter scan is per-plugin-slug).
+  printf -- '---\n"@windyroad/voice-tone": patch\n---\nfix voice\n' > .changeset/wr-voice-p999.md
+  git add .changeset/wr-voice-p999.md
+  git -c commit.gpgsign=false commit --quiet -m "voice changeset"
+  echo "skill body" > packages/itil/skills/foo/SKILL.md
+  git add packages/itil/skills/foo/SKILL.md
+  run run_bash_hook "git commit -m 'feat itil'"
+  [ "$status" -eq 0 ]
+  [[ "$output" == *"\"permissionDecision\": \"deny\""* ]]
+  [[ "$output" == *"itil"* ]]
+}
+@test "P141 Phase 2 allow: held-window docs/changesets-holding/*.md in-range entry also covers (ADR-042 Rule 7 composes with Phase 2)" {
+  mark_origin_at_head
+  mkdir -p docs/changesets-holding
+  printf -- '---\n"@windyroad/itil": patch\n---\nheld fix\n' > docs/changesets-holding/wr-itil-p347.md
+  git add docs/changesets-holding/wr-itil-p347.md
+  git -c commit.gpgsign=false commit --quiet -m "held changeset"
+  # Subsequent itil source commit — held entry in range covers.
+  echo "skill body" > packages/itil/skills/foo/SKILL.md
+  git add packages/itil/skills/foo/SKILL.md
+  run run_bash_hook "git commit -m 'feat'"
+  [ "$status" -eq 0 ]
+  [[ "$output" != *"\"permissionDecision\": \"deny\""* ]]
+}
+@test "P141 Phase 2: when no upstream and no origin/main ref exists, Check 2b skips silently and Phase 1 strict-deny is preserved" {
+  # No mark_origin_at_head — refs/remotes/origin/main is absent.
+  # Stage source without any changeset. Phase 1 strict-deny must fire
+  # (Check 2b returns 1 on missing base; Check 2a sees no staged changeset).
+  echo "skill body" > packages/itil/skills/foo/SKILL.md
+  git add packages/itil/skills/foo/SKILL.md
+  run run_bash_hook "git commit -m 'feat'"
+  [ "$status" -eq 0 ]
+  [[ "$output" == *"\"permissionDecision\": \"deny\""* ]]
+  [[ "$output" == *"P141"* ]]
+}
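
The scope boundary these tests exercise can be reproduced by hand in a throwaway repository. The sketch below assumes `git` is on the PATH; the repository location, the `g` wrapper, and the changeset filename `wr-itil-demo.md` are all illustrative. It shows only the first of the three candidate sources (in-range additions) and uses the same `git update-ref` trick as `mark_origin_at_head` to simulate a push.

```shell
# Throwaway repository; every path and name below is illustrative only.
repo=$(mktemp -d)
git -C "$repo" init --quiet
# Wrapper so commits work without any global git identity or signing setup.
g() { git -C "$repo" -c user.name=demo -c user.email=demo@example.invalid -c commit.gpgsign=false "$@"; }

g commit --quiet --allow-empty -m "base"
# Same trick as mark_origin_at_head: pretend the base commit is already pushed.
g update-ref refs/remotes/origin/main HEAD

mkdir -p "$repo/.changeset"
printf -- '---\n"@windyroad/itil": patch\n---\nfix the thing\n' > "$repo/.changeset/wr-itil-demo.md"
g add .changeset/wr-itil-demo.md
g commit --quiet -m "changeset"

# Changesets added in unpushed commits (origin/main..HEAD).
in_range=$(g log --diff-filter=A --name-only --pretty=format: origin/main..HEAD -- '.changeset/*.md' | grep -v '^$')
echo "in range: $in_range"

# Move the synthetic origin forward: nothing is left in the unpushed range.
g update-ref refs/remotes/origin/main HEAD
after_push=$(g log --diff-filter=A --name-only --pretty=format: origin/main..HEAD -- '.changeset/*.md' | grep -v '^$' || true)
echo "after push: ${after_push:-<none>}"
```

Unlike the deny-boundary test above, this sketch does not delete the changeset file after the simulated push, so it demonstrates the range calculation only, not the full gate decision.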

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@windyroad/itil",
-  "version": "0.41.0",
+  "version": "0.42.0",
   "description": "ITIL-aligned IT service management for Claude Code (problem, and future incident/change skills)",
   "bin": {
     "windyroad-itil": "./bin/install.mjs"

package/skills/capture-problem/SKILL.md CHANGED

@@ -28,6 +28,7 @@ This skill has **at most one classification-only AskUserQuestion (type-tag, ambi
 | Decision | Resolution |
 |----------|-----------|
 | Duplicate-check | Mechanical 3-keyword title-only grep; matches listed in report; capture proceeds regardless. False-positives are cheaper than false-negatives (P155 line 24). |
+| Hang-off arbitration (P346 Phase 3) | Mechanical pre-filter (≤5 candidates by shared ADR/RFC/SKILL/file signal) + fresh-context `wr-itil:hang-off-check` subagent dispatch (ADR-032 5th invocation pattern). Verdict-acts: `HANG_OFF: P<NNN>` halts capture + routes orchestrator to amend parent; `PROCEED_NEW` continues + appends rationale to `## Related`. AFK safe-default: ambiguous → PROCEED_NEW per subagent Rule 6 contract. JTBD-301 firewall: maintainer-side only. |
 | Priority default | Framework-policy: `3 (Medium) — Impact 3 × Likelihood 1` flagged "deferred — re-rate at next /wr-itil:review-problems". |
 | Effort default | Framework-policy: `M` flagged "deferred — re-rate at next /wr-itil:review-problems". |
 | Multi-concern split | Out of scope: capture-problem creates one ticket per invocation. Multi-concern observations route to `/wr-itil:manage-problem` (its Step 4b owns the split). |
@@ -138,7 +139,9 @@ Per ADR-060 § Phase 3 + Phase 4 in-scope amendment (2026-05-13). Fires ONLY whe
 **Phase 3 P3.1 nullable-field-conditional shape**: the JTBD-trace prompt + I12 hard-block fire on `jtbd_trace_value` nullability (absent vs present), NOT on the `type` field's value. The composite gate (`type == user-business AND jtbd_trace_value == empty`) treats `type` as upstream-determined co-incident input — exactly the carve-out permitted by ADR-060 line 536. Steps 2-7 below execute identically regardless of `type_value`, `jtbd_trace_value`, or `persona_value`; only the values substituted into the Step 4 skeleton template differ. This preserves I2 control-flow uniformity AND extends the I2 behavioural test (per ADR-060 Confirmation criterion 11) to assert no control-flow branch on `persona:` field presence.
-### 2. Minimal-grep duplicate check (3-keyword title-only)
+### 2. Minimal-grep duplicate check (3-keyword title-only) + hang-off-check subagent dispatch (P346 Phase 3 amendment, 2026-05-31)
+**Sub-step 2a — title-only grep (pre-existing minimal duplicate check)**
 Extract up to **3 distinct kebab-cased non-stopword keywords** from the description. Grep the **filenames** of `docs/problems/*.md` AND `docs/problems/<state>/*.md` (NOT bodies — title-only is the conservative threshold per architect verdict on Q1; dual-tolerant per RFC-002 migration window):
@@ -153,7 +156,78 @@ The **3-keyword cap** is a hard-coded constant. Do NOT make it env-overridable
 If matches are found: list them in the final report. **Do NOT halt or branch.** Capture proceeds. The user can resolve duplicates at the next `/wr-itil:review-problems` invocation (or invoke `/wr-itil:manage-problem` directly if the duplicate-check shape needs a structured branch).
-**After the grep completes**, write the per-session create-gate marker so the `PreToolUse:Write` hook (P119) permits the subsequent Write of the new ticket file under `docs/problems/open/`. Per **P260 / ADR-050 Option C**, write it under EVERY recent candidate session SID (not just one) so a concurrent orchestrator+subprocess race cannot land the marker under the wrong UUID:
+**Sub-step 2b — hang-off-check via fresh-context subagent (P346 Phase 3; ADR-032 5th invocation pattern)**
+The 3-keyword title-only grep at sub-step 2a is conservative: it catches narrow shape-overlap on titles but misses the wider class of hang-off candidates — parent tickets where the new capture's scope belongs as an Investigation Tasks expansion / Phase N section rather than as a sibling ticket. The wrongly-captured P347 sibling of P346 on 2026-05-31 is the canonical regression: the main agent (mid-iter, with rich session context) pattern-matched the existing capture flow and missed that the new spec belonged inside P346 as Phase 3.
+Sub-step 2b adds a **mechanical pre-filter + fresh-context subagent dispatch** that closes this gap without re-introducing the main agent's session-context bias. The subagent runs in isolation (no parent-session context) and emits a structured verdict the SKILL acts on deterministically.
+**Mechanical pre-filter** — grep `docs/problems/open/*.md` + `docs/problems/verifying/*.md` BODIES for tokens shared with the description: any `ADR-NNN` reference, `RFC-NNN` reference, `JTBD-NNN` reference, SKILL path (`/wr-<plugin>:<skill>` or `packages/<plugin>/skills/<skill>/`), file path (`packages/...`, `docs/...`, `.github/...`, `bin/...`, `scripts/...`), or named feature word the description cites. Collect candidates that share **≥1** signal.
+```bash
+# Extract candidate signals from the description (post-flag-strip).
+adr_refs=$(echo "$description" | grep -oE 'ADR-[0-9]{3}' | sort -u)
+rfc_refs=$(echo "$description" | grep -oE 'RFC-[0-9]{3}' | sort -u)
+skill_refs=$(echo "$description" | grep -oE '/wr-[a-z-]+:[a-z-]+' | sort -u)
+file_refs=$(echo "$description" | grep -oE '(packages|docs|\.github|bin|scripts)/[a-zA-Z0-9_./-]+' | sort -u)
+signals="$adr_refs"$'\n'"$rfc_refs"$'\n'"$skill_refs"$'\n'"$file_refs"
+signals=$(echo "$signals" | grep -v '^$' | sort -u)
+# If no signals extractable from description, skip the dispatch entirely
+# (the title-only grep at 2a is the only duplicate check this capture gets).
+[ -z "$signals" ] && SKIP_HANG_OFF_CHECK=1
+# Otherwise: pre-filter candidates from open/ + verifying/ that share ≥1 signal.
+candidates=()
+if [ -z "$SKIP_HANG_OFF_CHECK" ]; then
+  for f in docs/problems/open/*.md docs/problems/verifying/*.md; do
+    [ -f "$f" ] || continue
+    for sig in $signals; do
+      if grep -qF "$sig" "$f"; then
+        candidates+=("$f")
+        break
+      fi
+    done
+  done
+fi
+```
+**Candidate-cap short-circuit (latency-bound per ADR-032 + JTBD-001's 60s flow budget)**: if `${#candidates[@]} -gt 5`, **skip the subagent dispatch** and record the candidate list in the captured ticket's `## Related` section for review-time re-evaluation by `/wr-itil:review-problems`. Wide candidate sets blow the lightweight-capture latency budget; the safe default is "skip + defer to cluster pass."
+**Empty-candidates short-circuit**: if `${#candidates[@]} -eq 0` (no shared signals), skip the dispatch and proceed to the marker step. The mechanical pre-filter found nothing to arbitrate.
+**JTBD-301 firewall** — sub-step 2b fires on maintainer-side `/wr-itil:capture-problem` invocations ONLY. Plugin-user-side `.github/ISSUE_TEMPLATE/problem-report.yml` MUST NOT carry an equivalent dispatch (plugin-user descriptions do not carry the same authorial intent; a plugin-user describing their friction in maintainer vocabulary could plausibly trigger a wrong-parent HANG_OFF). Triage during `/wr-itil:manage-problem` ingestion stays user-judgement per JTBD-301. Mirrors the existing Step 1.5 firewall at line 116.
+**AFK safe-default (--no-prompt / AFK propagation)**: when `--no-prompt` is set, the dispatch still fires (the subagent verdict is non-interactive by construction — no `AskUserQuestion`), and ambiguous-multi-parent cases collapse to `PROCEED_NEW` per the subagent's Rule 6 contract. This satisfies JTBD-006's "Decisions normally requiring my input are resolved using safe defaults."
+**Dispatch** — when the candidate set is non-empty and ≤5, delegate to `wr-itil:hang-off-check` via the Agent tool with a structured input payload:
+```
+SURFACE: capture-problem-step-2b
+<new-capture>
+<description verbatim, post-flag-strip>
+</new-capture>
+<candidates>
+P<NNN1> | <title1> | <path1> | shared-signals: <signal1, signal2, ...>
+P<NNN2> | <title2> | <path2> | shared-signals: <signal1, signal3, ...>
+...
+</candidates>
+```
+The subagent reads the candidate ticket bodies in full as needed (via its own Read tool), reasons about absorb-vs-proceed, and returns one of:
+- `HANG_OFF: P<NNN>` with **Rationale**, **Signals matched**, **Where to absorb** sections.
+- `PROCEED_NEW` with **Rationale** and **Per-candidate explanation** for each surfaced candidate.
+**Act on verdict:**
+- **HANG_OFF: P<NNN>**: **halt** capture-problem. Emit a structured halt directive to the calling orchestrator agent naming (a) the parent ticket file path, (b) the new scope to amend in, (c) the `Where to absorb` directive from the subagent verdict. The orchestrator agent owns the parent-ticket edit + commit per the standard ticket-edit flow (do NOT amend the parent ticket from inside capture-problem — capture-problem creates new tickets; ticket-body amendments are manage-problem's surface). Record the hang-off decision + rationale in stderr for the audit trail.
+- **PROCEED_NEW**: continue to the marker step below. Capture the subagent's rationale + per-candidate explanation in a transient note (stderr) and append it to the new ticket's `## Related` section so the next reviewer sees what was considered. This is the audit-trail contract per ADR-026 grounding + JTBD-201 audit-trail completeness.
+**After the grep + (optional) hang-off-check completes**, write the per-session create-gate marker so the `PreToolUse:Write` hook (P119) permits the subsequent Write of the new ticket file under `docs/problems/open/`. Per **P260 / ADR-050 Option C**, write it under EVERY recent candidate session SID (not just one) so a concurrent orchestrator+subprocess race cannot land the marker under the wrong UUID:
 ```bash
 wr-itil-mark-create-gate
@@ -312,7 +386,10 @@ The two skills share the `/tmp/manage-problem-grep-${SESSION_ID}` create-gate ma
 - **P265** — the RISK_BYPASS-trailer allow-list mechanism in `readme-refresh-detect.sh` that P262's `capture-deferred-readme` token registers into.
 - **P170** (`docs/problems/known-error/170-problem-tickets-strain-as-fixes-decompose-into-multiple-coordinated-changes-need-rfc-framework.md`) — RFC framework driver; Slice 4 B7.T3 / item 8c authored the type-classification prompt at Step 1.5.
 - **P176** — agent-side I2 (no type-branching) coverage gap on the SKILL.md surface (this file's surface); descendant of P012 master harness ticket. The Step 1.5 I2 invariant guard is enforced by audit-trailed prose here per ADR-052 § Surface 2 escape-hatch contract; behavioural enforcement awaits the master harness.
-- **ADR-032** (`docs/decisions/032-governance-skill-invocation-patterns.proposed.md`) — foreground-lightweight-capture variant amendment.
+- **ADR-032** (`docs/decisions/032-governance-skill-invocation-patterns.proposed.md`) — foreground-lightweight-capture variant amendment (P155); 5th invocation pattern amendment (P346 Phase 3, 2026-05-31) codifies the hang-off-check sub-step 2b dispatch as the canonical fresh-context-subagent-as-decision-arbiter shape.
+- **P346** (`docs/problems/.../346-...md`) — backlog-flow-control master ticket; Phase 3 deliverable lands the hang-off-check dispatch at sub-step 2b above.
+- **RFC-013** (`docs/rfcs/RFC-013-...proposed.md`) — traces P346 Phases 1+2+3 per ADR-071 unconditional Problem→RFC trace.
+- **`packages/itil/agents/hang-off-check.md`** — the fresh-context subagent invoked by sub-step 2b; reads only the structured input payload; emits HANG_OFF: P<NNN> or PROCEED_NEW with rationale + signals + absorb directive.
 - **ADR-038** — progressive-disclosure pattern (SKILL.md + REFERENCE.md split).
 - **ADR-044** — decision-delegation contract; type classification is **derive-first**: silent-framework per category 4 on unambiguous-signal descriptions (the classifier IS the framework resolving the answer from observable evidence per ADR-026 grounding); taste per category 5 fallback on genuinely-ambiguous descriptions only. `--no-prompt` / `--type=<value>` are policy-authorised silent-proceed shapes per category 4 (caller-side pre-resolution). P185 re-classified Step 1.5's taxonomy position from "cat 5 unconditional ask" to "cat 4 derive-first with cat 5 fallback".
 - **P185** — `/wr-itil:capture-problem` asks a classification question it can answer itself from the description's observable evidence — inverse-P078 / P132 trap at a SKILL contract surface. The Step 1.5 derive-first refactor (lexical-signal classifier + stderr advisory) ships this fix.
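
The two short-circuits in sub-step 2b (empty candidate set, and more than 5 candidates) plus the dispatch case reduce to a three-way decision on the candidate count. A minimal sketch, assuming a hypothetical function name `hang_off_dispatch_decision` and illustrative result labels that do not appear in the package:

```shell
# Hypothetical helper: decide what sub-step 2b does for a given candidate count.
hang_off_dispatch_decision() {
  count="$1"
  if [ "$count" -eq 0 ]; then
    echo "skip-empty"   # nothing to arbitrate; go straight to the marker step
  elif [ "$count" -gt 5 ]; then
    echo "skip-defer"   # too wide; record candidates under ## Related for review time
  else
    echo "dispatch"     # 1 to 5 candidates; delegate to the subagent
  fi
}

for n in 0 3 5 6; do
  printf '%s candidates -> %s\n' "$n" "$(hang_off_dispatch_decision "$n")"
done
```

In the SKILL's own snippet the count would come from `${#candidates[@]}`; passing it as a plain argument here keeps the sketch independent of bash arrays.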

package/skills/manage-problem/SKILL.md CHANGED

@@ -351,6 +351,28 @@ Before creating, search existing problems for similar issues. The user may not k
 **Search strategy**: Search problem filenames AND file content. A match on the filename (kebab-case title) or the Description/Symptoms sections counts. Cast a wide net — false positives are cheap (user chooses), but false negatives mean duplicate problems.
+#### Sub-step 2.8 — Hang-off-check via fresh-context subagent (P346 Phase 3 amendment, 2026-05-31; ADR-032 5th invocation pattern)
+The wide-net grep + AskUserQuestion at sub-steps 1-6 above handles the title/keyword-overlap class. Sub-step 2.8 closes a wider gap: parent tickets where the new problem's scope belongs absorbed as an Investigation Tasks expansion / Phase N section rather than as a sibling. The wrongly-captured P347 sibling of P346 on 2026-05-31 (now closed as duplicate-of-P346) is the canonical regression — the main agent mid-iter pattern-matched the existing capture flow under session-context bias.
+Sub-step 2.8 adds a **mechanical pre-filter + fresh-context subagent dispatch** that closes this gap without re-introducing the main agent's bias. Mirrors the `/wr-itil:capture-problem` Step 2 sub-step 2b dispatch verbatim.
+**Mechanical pre-filter** — grep `docs/problems/open/*.md` + `docs/problems/verifying/*.md` BODIES for tokens shared with the new problem's description: any `ADR-NNN` / `RFC-NNN` / `JTBD-NNN` reference, SKILL path (`/wr-<plugin>:<skill>` or `packages/<plugin>/skills/<skill>/`), or file path (`packages/...`, `docs/...`, `.github/...`, `bin/...`, `scripts/...`). Collect candidates that share ≥1 signal; cap at 5; empty or >5 → skip dispatch.
+**JTBD-301 firewall** — sub-step 2.8 fires on **maintainer-internal new-problem** captures ONLY. The dispatch MUST be skipped when manage-problem is ingesting a plugin-user-reported issue from `.github/ISSUE_TEMPLATE/problem-report.yml` (plugin-user descriptions do not carry the same authorial intent as maintainer-internal captures; a plugin-user describing their friction in maintainer vocabulary could plausibly trigger a wrong-parent HANG_OFF). When ingesting plugin-user reports, triage stays user-judgement per JTBD-301. Mirrors the existing Step 1.5 / Step 4 firewall patterns in `/wr-itil:capture-problem` (see line 116 of `packages/itil/skills/capture-problem/SKILL.md`).
+**AFK safe-default**: when `--no-prompt` is propagated, the dispatch still fires (the subagent verdict is non-interactive by construction — no `AskUserQuestion`), and ambiguous-multi-parent cases collapse to `PROCEED_NEW` per the subagent's Rule 6 contract. This satisfies JTBD-006's safe-default contract.
+**Dispatch** — delegate to `wr-itil:hang-off-check` via the Agent tool with the same structured payload shape capture-problem uses (`SURFACE: manage-problem-step-2.8`; `<new-capture>` payload; `<candidates>` payload with `P<NNN> | <title> | <path> | shared-signals: ...` per row). The subagent reads candidate bodies in full and emits:
+- `HANG_OFF: P<NNN>` with **Rationale**, **Signals matched**, **Where to absorb** → halt manage-problem's new-problem creation; route to the parent-ticket update flow (Step 6 ticket-body edit on the named parent). Record the hang-off decision + rationale in the parent ticket's Investigation Tasks bullet or `### Phase N — <name>` section per the subagent's `Where to absorb` directive. Single-commit grain preserved (the parent-ticket amendment commit IS this manage-problem invocation's commit per ADR-014).
+- `PROCEED_NEW` with **Rationale** + **Per-candidate explanation** → continue to Step 3 (ID assignment). Append the subagent's rationale + per-candidate explanation to the new ticket's `## Related` section as the audit trail per ADR-026 grounding + JTBD-201.
+**Why a subagent (not in-SKILL checks)**: the main agent is biased by session context — mid-iter, mid-work, pattern-matching existing flows ("I captured X then dispatched iter; do the same shape for Y"). A fresh subagent invocation starts clean, reads only the structured inputs, and reasons about candidate absorption without the bias. Same architectural pattern as `wr-architect:agent` / `wr-jtbd:agent` / `tdd:review-test` / `wr-risk-scorer:pipeline` — codified as ADR-032's 5th invocation pattern under the P346 amendment 2026-05-31.
+**Cross-references**: `packages/itil/agents/hang-off-check.md` (the subagent); `packages/itil/agents/test/fixtures/regression-p347-vs-p346.md` (canonical behavioural fixture); `docs/decisions/032-governance-skill-invocation-patterns.proposed.md` § Foreground fresh-context-subagent-as-decision-arbiter variant; `docs/rfcs/RFC-013-p346-backlog-flow-control-multi-phase.proposed.md` (multi-phase trace per ADR-071); `docs/problems/.../346-...md` (driver master ticket).
 **Hook contract (P119)**: writing a `.open.md` (or any `.<status>.md`) file under `docs/problems/` without first running this Step 2 grep + marker-touch is blocked by the `manage-problem-enforce-create.sh` PreToolUse hook with a `permissionDecision: deny` directing the agent back to this skill. Agents that try to bypass the skill (e.g. mid-retrospective inline capture, post-mortem wrap-up, or any "I'll just write it directly" shortcut) will hit the deny and be redirected here. Do not work around the deny by setting the marker manually — the marker exists to record that this Step 2 ran, and a marker without a grep is the audit-trail gap P119 closes.
 ### 3. For new problems: Assign the next ID
@@ -906,6 +928,8 @@ Commit the completed work per ADR-014 (governance skills commit their own work):
    - Fix implemented: `fix(<scope>): <description> (closes P<NNN>)` — include problem file changes (rename to `.verifying.md` + `## Fix Released` section) in the same commit per ADR-022
 4. If risk is above appetite: use `AskUserQuestion` to ask whether to commit anyway, remediate first, or park the work. If `AskUserQuestion` is unavailable, skip the commit and report the uncommitted state clearly (ADR-013 Rule 6 fail-safe). This applies only to the risk-above-appetite branch, not to the delegation-unavailable case above.
+**Multi-commit slice changeset discipline (P141 Phase 2)**: when a single logical fix lands across multiple ADR-014-grain commits targeting the same plugin (e.g. helper extraction in commit 1, callers wired in commit 2, SKILL note + transition in commit 3, all under `packages/<plugin>/`), author ONE changeset on the first commit in the slice. Subsequent same-plugin commits do NOT need their own changeset: the `itil-changeset-discipline.sh` hook's Check 2b allows the commit when any `.changeset/*.md` (or held-window `docs/changesets-holding/*.md` entry) already in the unpushed slice scope (`origin/<base>..HEAD` + untracked + modified-not-staged) targets `"@windyroad/<plugin>": <any-bump>`. This eliminates the per-commit changeset ceremony that previously produced N redundant `.changeset/*.md` files for one logical release entry (changesets-action collapses bump-class at version-package time, so per-commit changesets rendered N near-identical CHANGELOG bullets for one release). Once a changeset reaches `origin/<base>` (drained at release time), it no longer counts, and a fresh changeset is required for the next slice. Cross-plugin coverage is NOT permitted: an `@windyroad/itil` changeset does not satisfy a `packages/voice-tone/` commit.
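The same-plugin coverage rule can be illustrated with a minimal sketch. It is not the hook's actual Check 2b implementation: the function name `slice_covers_plugin` is hypothetical, and it assumes the caller has already gathered the candidate changeset files in the unpushed slice scope and that each file declares its target as a `"@windyroad/<plugin>": <bump>` line.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not the shipped hook logic.
# Usage: slice_covers_plugin <plugin> <changeset-file>...
# Returns 0 if any candidate file declares a bump for "@windyroad/<plugin>".
slice_covers_plugin() {
  local plugin="$1"
  shift
  local file
  for file in "$@"; do
    # Skip paths that do not exist (e.g. an unexpanded glob).
    [ -f "$file" ] || continue
    # -F: fixed-string match, so the plugin name is never treated as a regex.
    if grep -qF -- "\"@windyroad/${plugin}\":" "$file"; then
      return 0
    fi
  done
  # No file targets this plugin; a changeset for another plugin does not count.
  return 1
}
```

A caller might invoke it as `slice_covers_plugin itil .changeset/*.md docs/changesets-holding/*.md` and require a new changeset only when it returns non-zero.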
 ### 12. Auto-release when changesets are queued (ADR-020)
 **Skip this step if the skill is running inside an AFK orchestrator** (e.g. `/wr-itil:work-problems`). Orchestrators handle release cadence themselves per ADR-018 (Step 6.5). Detect via the presence of an orchestrator marker in the invoking prompt — look for phrases like "AFK", "work-problems", "batch-work", or the sentinel `ALL_DONE` convention. When in doubt, defer to the orchestrator by skipping this step.