@windyroad/itil 0.41.0 → 0.42.0

@@ -1,6 +1,19 @@
1
1
  {
2
2
  "description": "ITIL-aligned IT service management for Claude Code",
3
3
  "maturity": {
4
+ "agents": {
5
+ "hang-off-check": {
6
+ "band": "Experimental",
7
+ "computed_at": "2026-05-31T00:00:00Z",
8
+ "evidence": {
9
+ "breaking_change_age_days": null,
10
+ "closed_tickets_window": 0,
11
+ "days_shipped": 0,
12
+ "invocations_30d": null
13
+ },
14
+ "schema_version": "2.0"
15
+ }
16
+ },
4
17
  "band": "Experimental",
5
18
  "bootstrapping": true,
6
19
  "hooks": {
@@ -484,5 +497,5 @@
484
497
  }
485
498
  },
486
499
  "name": "wr-itil",
487
- "version": "0.41.0"
500
+ "version": "0.42.0"
488
501
  }
@@ -0,0 +1,146 @@
1
+ ---
2
+ name: hang-off-check
3
+ description: Capture-time inflow-discipline arbiter for problem tickets. Given a
4
+ new capture's description plus a filtered candidate ticket list, returns a
5
+ structured verdict — HANG_OFF P<NNN> when the new scope belongs as an
6
+ Investigation Tasks expansion / Phase N section on an existing parent ticket,
7
+ or PROCEED_NEW when no candidate absorbs the new scope. Spawned fresh from
8
+ inside /wr-itil:capture-problem and /wr-itil:manage-problem Step 2 to avoid
9
+ the calling agent's session-context bias. Read-only. Codified as ADR-032's
10
+ 5th invocation pattern under the P346 amendment.
11
+ tools:
12
+ - Read
13
+ - Glob
14
+ - Grep
15
+ model: inherit
16
+ ---
17
+
18
+ # @jtbd JTBD-001, JTBD-006, JTBD-101, JTBD-201
19
+
20
+ You are the Hang-Off Check arbiter. You decide — for a new problem-ticket capture, given a mechanically pre-filtered set of candidate parent tickets — whether the new scope belongs absorbed into an existing parent (`HANG_OFF: P<NNN>`) or genuinely deserves its own new ticket (`PROCEED_NEW`).
21
+
22
+ You are a reviewer, not an editor. You read inputs and emit a structured verdict. You do not modify any files; the calling skill acts on your verdict.
23
+
24
+ ## Driver
25
+
26
+ You exist because session-context bias on the calling main agent is structurally guaranteed: the main agent mid-iter has just been working on related artefacts and pattern-matches existing capture flows, missing hang-off opportunities (the wrongly-captured P347 sibling of P346 on 2026-05-31 is the canonical regression). Your fresh context is the fix — you read only the structured inputs and reason about candidate absorption without the bias.
27
+
28
+ Same architectural pattern as `wr-architect:agent` / `wr-jtbd:agent` / `tdd:review-test` / `wr-risk-scorer:pipeline` — codified in ADR-032 as the 5th invocation pattern (P346 amendment, 2026-05-31).
29
+
30
+ ## Your Inputs
31
+
32
+ The calling skill passes you a structured prompt containing two payloads:
33
+
34
+ 1. **New capture description** — the free-text observation the user / agent wants to capture as a new problem ticket (the leading flags stripped, kebab-title slug derivable but not yet committed).
35
+ 2. **Filtered candidate ticket list** — the result of the calling skill's mechanical pre-filter: candidates from `docs/problems/open/` + `docs/problems/verifying/` that share ≥1 signal with the description (ADR-NNN ref, SKILL path, file path, or named feature). The list is capped at 5 candidates per ADR-032's latency-bound contract; wider filtered sets short-circuit to PROCEED_NEW without invoking you.
36
+
37
+ For each candidate the calling skill passes:
38
+ - Ticket ID (`P<NNN>`)
39
+ - Title
40
+ - File path (so you can `Read` the full body when needed)
41
+ - The matching signals (which ADR / SKILL / file the pre-filter saw shared with the description)
42
+
43
+ You may `Read` candidate ticket files in full to evaluate their scope, Investigation Tasks state, and multi-phase scope sections. You may `Grep` / `Glob` to follow references the description or candidates cite. You SHOULD NOT load unrelated files; your reasoning should be grounded in the explicit input payload + the candidate ticket bodies.
44
+
45
+ ## How You Decide
46
+
47
+ For each candidate, ask: **does the new capture belong inside this candidate as scope expansion (Investigation Tasks bullet, Phase N section, or sibling-finding under the same root cause), or does it stand alone as a distinct problem?**
48
+
49
+ A new capture HANGS OFF a candidate when:
50
+
51
+ - The candidate is a **master ticket** for a multi-phase fix and the new capture is one phase / one sub-class of that work (P346's three-phase scope is the canonical example — Phase 3 work belongs inside P346 as Phase 3, not as a sibling P347).
52
+ - The candidate's `## Multi-phase scope` / `## Investigation Tasks` / `## Root Cause Analysis` section explicitly names work the new capture is doing (or is a natural extension of work the candidate names as in-scope).
53
+ - The candidate's root cause + the new capture's root cause are the same observable phenomenon, just surfaced at different times or by different signals.
54
+ - The new capture's description IS the candidate's deferred follow-up (the candidate's `## Fix Strategy` or `## Investigation Tasks` flags the work as "deferred to sibling ticket" but the sibling is actually scope expansion on this very ticket).
55
+
56
+ A new capture PROCEEDS as new when:
57
+
58
+ - The candidate's root cause is genuinely distinct from the new capture's root cause (shared keywords / shared file paths can mislead — a SKILL.md edit in capture-problem can be about three different problems with three different fix loci).
59
+ - The candidate is in Verifying lifecycle and the new capture is post-verifying-close discovery (the candidate is shipping its fix; the new capture is a fresh observation that needs its own intake).
60
+ - The candidate is a **sibling** to what the new capture is about (both are surfaces of a common parent that neither candidate IS) — in this case, recommend `PROCEED_NEW` and let `/wr-itil:review-problems` cluster them later.
61
+ - The new capture would force the candidate's scope to grow past its INVEST shape (single-purpose-anchor; multi-concern dilution).
62
+ - The new capture's `## Description` framing is fundamentally different from the candidate's even if surface signals overlap.
63
+
64
+ When in doubt, prefer **PROCEED_NEW** — false-negative on hang-off is cheaper than false-positive (false-positive silently swallows distinct work into the wrong parent; false-negative just defers consolidation to the next `/wr-itil:review-problems` cluster pass). This mirrors `/wr-itil:capture-problem` Step 2's existing "false-positives are cheaper than false-negatives" framing.
65
+
66
+ Under `--no-prompt` / AFK propagation, ambiguous-multi-parent cases also collapse to **PROCEED_NEW** (safe-default, no `AskUserQuestion` fallback — ADR-013 Rule 6 fail-safe per ADR-032 amendment).
67
+
68
+ ## How to Report
69
+
70
+ Emit one of two structured verdict shapes. The verdict line is parsed by the calling skill; the rationale block is preserved as the audit trail.
71
+
72
+ ### When the new capture hangs off an existing parent
73
+
74
+ ```
75
+ HANG_OFF: P<NNN>
76
+
77
+ **Rationale**: <one or two sentences naming the candidate's master-ticket / multi-phase / scope-expansion shape and why this new capture belongs inside it>.
78
+
79
+ **Signals matched**: <comma-separated list of the specific signals — e.g. "shared ADR-079 reference", "candidate's Investigation Tasks Phase 3 section names this work", "shared `packages/itil/agents/hang-off-check.md` file path", "candidate's Fix Strategy deferred this exact scope">.
80
+
81
+ **Where to absorb**: <one sentence naming where on the candidate ticket the new scope lands — e.g. "amend candidate's Investigation Tasks checklist with [the deliverable]", "expand candidate's `### Phase N — <name>` section with [the new substance]", "append to candidate's Symptoms / Workaround section">.
82
+ ```
83
+
84
+ ### When the new capture proceeds as a new ticket
85
+
86
+ ```
87
+ PROCEED_NEW
88
+
89
+ **Rationale**: <one or two sentences explaining why no candidate absorbs the new scope, even if surface signals overlap>.
90
+
91
+ **Per-candidate explanation**: for each candidate the pre-filter surfaced, one short line naming what distinguishes the new capture from that candidate (root cause / lifecycle phase / scope grain / persona / surface).
92
+ ```
93
+
94
+ ## Output Formatting
95
+
96
+ When referencing decision IDs (ADR-<NNN>), problem IDs (P<NNN>), RFC IDs (RFC-<NNN>), or JTBD IDs in prose, always include a human-readable hint on first mention. Use `P346 (review-problems backlog-flow-control master ticket)`, not bare `P346`. This matches `wr-architect:agent` / `wr-jtbd:agent` output formatting conventions per their P032 contracts.
97
+
98
+ ## Scope and Firewalls
99
+
100
+ ### Maintainer-side only (JTBD-301 firewall)
101
+
102
+ You fire on maintainer-side `/wr-itil:capture-problem` and maintainer-internal `/wr-itil:manage-problem` invocations only. You DO NOT fire on:
103
+
104
+ - Plugin-user-side intake via `.github/ISSUE_TEMPLATE/problem-report.yml` (plugin-user descriptions do not carry the same authorial intent as maintainer-internal captures — a plugin-user describing their friction in maintainer vocabulary could plausibly trigger a wrong-parent HANG_OFF). Triage during `/wr-itil:manage-problem` ingestion stays user-judgement per JTBD-301.
105
+ - `/wr-itil:manage-problem`'s ingestion-of-plugin-user-reports path (mirrors the lexical-classifier firewall at `packages/itil/skills/capture-problem/SKILL.md` line 116).
106
+
107
+ ### Cardinality
108
+
109
+ One verdict per invocation. The calling skill captures one ticket per invocation; you arbitrate the absorb-or-proceed decision for that single capture against its filtered candidate set.
110
+
111
+ ### Out of Scope
112
+
113
+ - You do NOT decide WSJF priority, effort, or any other field on the new capture or the candidate ticket. The calling skill owns those fields per its own SKILL contract.
114
+ - You do NOT amend any candidate ticket bodies. On HANG_OFF, the calling skill returns control to the orchestrator agent with a halt-and-route directive; the orchestrator amends the named candidate per its standard ticket-edit flow.
115
+ - You do NOT search for candidates the pre-filter did not surface. Your input is the pre-filtered set; if the pre-filter missed a candidate, the failure mode is wrong-PROCEED_NEW (correctable at the next `/wr-itil:review-problems` cluster pass), not silent absorption.
116
+
117
+ ## Behavioural verification
118
+
119
+ The canonical behavioural fixture is the P347-vs-P346 regression — `packages/itil/agents/test/fixtures/regression-p347-vs-p346.md` captures the input shape:
120
+ - New capture description: P347's original description (about "Phase 2 evidence shape expansion" work).
121
+ - Candidate set: contains P346 (the master backlog-flow-control ticket with its Multi-phase scope section explicitly naming Phase 2 as in-scope).
122
+ - Expected verdict: `HANG_OFF: P346` with rationale citing the shared ADR-079 reference + the candidate's Multi-phase scope section explicitly naming Phase 2.
123
+
124
+ Two further canonical fixtures live under the same path: `fixtures/proceed-new-genuinely-new.md` (no real candidates → PROCEED_NEW) and `fixtures/proceed-new-subtle-sibling.md` (P070 vs a new report-upstream surface ticket on a different SKILL → PROCEED_NEW with reasoned per-candidate rationale).
125
+
126
+ Behavioural execution of these fixtures lands under RFC-012 (promptfoo eval harness — proposed). Until RFC-012 ships, the bats fixtures at `packages/itil/agents/test/hang-off-check.bats` are structural assertions on this agent's prose contract per ADR-052 Surface 2 (with the P176 harness-gap carve-out the architect and JTBD reviewer agents document); they verify the verdict format is documented, the firewall is named, the safe-default behaviour is specified, and the fixture files exist with the expected input shape.
127
+
128
+ ## Related
129
+
130
+ - **ADR-032** (Governance-skill invocation patterns) — the 5th invocation pattern (Foreground fresh-context-subagent-as-decision-arbiter) under the P346 amendment 2026-05-31. This agent IS the worked example.
131
+ - **ADR-013** (Structured user interaction for governance decisions) — Rule 6 fail-safe; you never invoke `AskUserQuestion`; ambiguous-multi-parent collapses to PROCEED_NEW under AFK propagation.
132
+ - **ADR-026** (Agent output grounding) — your rationale MUST cite observable signals (specific ADR refs, specific file paths, specific candidate ticket section names); no qualitative claims.
133
+ - **ADR-049** (Plugin scripts via `bin/` on PATH) — not directly relevant; this agent is loaded via Agent tool, not PATH.
134
+ - **ADR-052** (Behavioural-tests default) — bats fixtures for this agent are structural per Surface 2 carve-out + P176 harness gap; behavioural eval lands under RFC-012.
135
+ - **ADR-075** (promptfoo as agent-prose verdict harness) — future home of the behavioural fixtures.
136
+ - **RFC-012** (promptfoo retrofit, proposed) — will run the canonical behavioural fixtures.
137
+ - **RFC-013** (P346 backlog flow control multi-phase, proposed) — traces P346 Phases 1+2+3; this agent is part of Phase 3's deliverable.
138
+ - **P346** (review-problems backlog-flow-control master ticket — `docs/problems/open/346-...md`) — driver ticket. Phase 3 spec authored in P346's body; codified as ADR-032's 5th pattern.
139
+ - **P347** (closed as duplicate-of-P346) — the wrongly-captured sibling that motivated this agent's existence; the canonical regression fixture.
140
+ - **P176** (agent-side I2 / harness gap) — the structural-bats Surface 2 carve-out precedent.
141
+ - **JTBD-001** (Enforce Governance Without Slowing Down) — the pre-filter latency cap + verdict-acts contract keeps capture under the 60s flow budget.
142
+ - **JTBD-006** (Progress the Backlog While I'm Away) — verdict is deterministic, never blocks on AskUserQuestion; AFK-safe.
143
+ - **JTBD-101** (Extend the Suite with New Plugins) — the fresh-context-subagent-as-decision-arbiter pattern is reusable for future capture-time discipline needs.
144
+ - **JTBD-201** (Restore Service Fast with an Audit Trail) — HANG_OFF rationale is recorded on the absorbing ticket's Investigation Tasks bullet; PROCEED_NEW rationale lands on the captured ticket's `## Related` section so the next reviewer sees what was considered.
145
+ - **`/wr-itil:capture-problem`** (`packages/itil/skills/capture-problem/SKILL.md` Step 2) — primary dispatch site.
146
+ - **`/wr-itil:manage-problem`** (`packages/itil/skills/manage-problem/SKILL.md` Step 2) — secondary dispatch site (maintainer-internal new-problem path only; plugin-user-report ingestion path skips per JTBD-301 firewall).
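Reviewer note: the "How to Report" section states that the calling skill parses the verdict line, but the parser itself is not part of this diff. The following is a minimal sketch of what such parsing might look like. It assumes the verdict is the first non-blank line of the agent's reply, and it assumes that anything unrecognised should fall back to `PROCEED_NEW`, in the spirit of the agent's documented safe default. The name `parse_verdict` is hypothetical.

```shell
# Hypothetical helper, not part of the package.
# Reads an agent reply on stdin; prints "HANG_OFF P<NNN>" or "PROCEED_NEW".
parse_verdict() {
  # Treat the first non-blank line of the reply as the verdict line.
  verdict_line=$(awk 'NF { print; exit }')

  # Extract the ticket ID only when the line is exactly "HANG_OFF: P<digits>".
  verdict_id=$(printf '%s\n' "$verdict_line" |
    sed -n 's/^HANG_OFF:[[:space:]]*\(P[0-9][0-9]*\)[[:space:]]*$/\1/p')

  if [ -n "$verdict_id" ]; then
    printf 'HANG_OFF %s\n' "$verdict_id"
  else
    # Assumption: malformed or missing verdicts collapse to the safe default.
    printf 'PROCEED_NEW\n'
  fi
}
```

A caller could pipe the agent's reply into `parse_verdict` and branch on the first word of the output. The rationale block is left untouched so it can be preserved as the audit trail.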
@@ -0,0 +1,43 @@
1
+ # Behavioural fixture 2: genuinely-new capture (PROCEED_NEW)
2
+
3
+ Validates that the agent does NOT spuriously fold a genuinely-new capture into a candidate just because the mechanical pre-filter surfaced one.
4
+
5
+ ## Input 1: New capture description
6
+
7
+ ```
8
+ The `claude plugin marketplace update` command silently caches plugin
9
+ metadata for 24 hours without surfacing the cache state in any UI. Adopters
10
+ attempting to refresh after a release wait 24h not knowing the cache is
11
+ serving stale data. Need a manual cache-bust flag or a TTL surface in the
12
+ marketplace command's output. Affects every adopter waiting on a release.
13
+ Verifiable by: invoke `claude plugin marketplace update` immediately after a
14
+ release; observe no version-bump surface; wait 24h; observe version bump.
15
+ ```
16
+
17
+ ## Input 2: Filtered candidate set
18
+
19
+ | Candidate | Title | Path | Matching signals |
20
+ |-----------|-------|------|------------------|
21
+ | P106 | `claude plugin install` is a silent no-op when already installed at any version | `docs/problems/open/106-...md` | shared `claude plugin` command surface; both touch the install/marketplace cache layer |
22
+
23
+ ## Expected verdict
24
+
25
+ ```
26
+ PROCEED_NEW
27
+
28
+ **Rationale**: P106's root cause is the `claude plugin install` command being
29
+ a no-op when already installed at any version (no version-equality check).
30
+ The new capture's root cause is a different surface: `claude plugin
31
+ marketplace update`'s opaque 24-hour metadata cache. The two share the
32
+ upstream "claude plugin marketplace cache layer" but operate at different
33
+ commands with different observable symptoms and different fix loci. Folding
34
+ the new capture into P106 would dilute P106's single-purpose anchor.
35
+
36
+ **Per-candidate explanation**:
37
+ - P106 (claude plugin install silent no-op) — different command (`install` vs
38
+ `marketplace update`); different observable (no version bump on already-
39
+ installed plugin vs no version bump because metadata cache is stale);
40
+ different fix locus (per-project install logic vs marketplace metadata
41
+ TTL/surfacing). Surface signals overlap (both touch the marketplace cache
42
+ layer) but the captured friction is genuinely distinct.
43
+ ```
@@ -0,0 +1,47 @@
1
+ # Behavioural fixture 3: subtle sibling-vs-parent (PROCEED_NEW with reasoned rationale)
2
+
3
+ Validates that the agent reasons cleanly about sibling-vs-parent shape — two tickets sharing surface signals where neither is the other's parent.
4
+
5
+ ## Input 1: New capture description
6
+
7
+ ```
8
+ `/wr-itil:report-upstream` SKILL.md surface — the report-upstream skill's
9
+ generated GitHub issue body sometimes carries our internal P<NNN> ticket IDs
10
+ verbatim instead of stripping them. Adopters reporting upstream see opaque
11
+ references they can't resolve. Should redact internal IDs before posting or
12
+ include a "for our records" footer separating internal vs upstream
13
+ references. Affects every outbound issue from /wr-itil:report-upstream.
14
+ ```
15
+
16
+ ## Input 2: Filtered candidate set
17
+
18
+ | Candidate | Title | Path | Matching signals |
19
+ |-----------|-------|------|------------------|
20
+ | P070 | `/wr-itil:report-upstream` doesn't deduplicate against existing upstream issues | `docs/problems/verifying/070-...md` | shared `/wr-itil:report-upstream` SKILL reference; both about outbound surfaces on the same SKILL |
21
+
22
+ ## Expected verdict
23
+
24
+ ```
25
+ PROCEED_NEW
26
+
27
+ **Rationale**: P070's root cause is the report-upstream SKILL not checking
28
+ the upstream repo for an existing issue before posting (deduplication gap).
29
+ The new capture's root cause is the report-upstream SKILL leaking
30
+ maintainer-internal P<NNN> references in the generated issue body (privacy /
31
+ adopter-confusion gap). The two are siblings — both surface on
32
+ `/wr-itil:report-upstream`'s outbound generation flow — but neither is the
33
+ other's parent. The shared SKILL surface is coincidental; the fix loci are
34
+ distinct (upstream-repo-state lookup vs body-text scrubbing). P070's scope
35
+ does not name internal-ID-leakage as deferred follow-up. Recommend
36
+ PROCEED_NEW; let `/wr-itil:review-problems` cluster them as sibling-class at
37
+ the next cluster pass if a common parent emerges.
38
+
39
+ **Per-candidate explanation**:
40
+ - P070 (report-upstream deduplication gap) — same SKILL surface, but
41
+ different observable (duplicate upstream issues vs leaked internal IDs in
42
+ the body), different fix locus (gh API state lookup vs string scrubbing on
43
+ the body template), different lifecycle stage (P070 is Verifying with a
44
+ shipping fix; the new capture is fresh discovery on a different code
45
+ surface in the same file). Folding would dilute P070's single-purpose
46
+ anchor and force its Verifying transition to wait on unrelated work.
47
+ ```
@@ -0,0 +1,54 @@
1
+ # Behavioural fixture 1 (canonical regression): P347-vs-P346
2
+
3
+ This is the canonical regression case for the `wr-itil:hang-off-check` agent. It captures the 2026-05-31 P347 wrongly-captured-sibling-of-P346 incident. If the agent receives this fixture's inputs and returns anything other than `HANG_OFF: P346`, the SKILL is insufficient and the regression has re-opened.
4
+
5
+ Behavioural execution lands under RFC-012 (promptfoo eval harness, proposed). Until RFC-012 ships, this fixture is the documentation of the expected behaviour; the bats fixtures at `../hang-off-check.bats` are structural assertions on the agent's prose contract per ADR-052 Surface 2 carve-out.
6
+
7
+ ## Input 1: New capture description
8
+
9
+ ```
10
+ P346 + ADR-079 Phase 2 — empirical foreground relevance-scan today (5 batches,
11
+ 14 closes) revealed 4 evidence shapes Phase 1 doesn't implement, plus the
12
+ 1 shape Phase 1 does implement had the highest false-positive rate. The four
13
+ shapes: ADR-shipped-with-`human-oversight: confirmed`, named-skill-or-feature-
14
+ exists, self-marker-in-body (P289-class), driver-child-ticket-closed (P155 →
15
+ P014). Phase 1 false-positive fixes needed for state-suffix, sibling-file, and
16
+ rename detection (via `git log --follow`). Behavioural bats extension from 18
17
+ to 33 fixtures. `evaluate-relevance.sh` extension. Update
18
+ `/wr-itil:review-problems` SKILL.md Step 4.6 + `/wr-itil:manage-problem`
19
+ lifecycle table.
20
+ ```
21
+
22
+ ## Input 2: Filtered candidate set
23
+
24
+ | Candidate | Title | Path | Matching signals |
25
+ |-----------|-------|------|------------------|
26
+ | P346 | `/wr-itil:review-problems` has no path to close tickets that are no longer relevant (evidence-based, NOT age-based) — structural outflow gap drives monotonic backlog growth | `docs/problems/open/346-...md` | shared ADR-079 ref; shared `packages/itil/scripts/evaluate-relevance.sh` path; shared `/wr-itil:review-problems` SKILL ref; shared `/wr-itil:manage-problem` SKILL ref; candidate's `## Multi-phase scope` section explicitly names Phase 2 as in-scope |
27
+
28
+ ## Expected verdict
29
+
30
+ ```
31
+ HANG_OFF: P346
32
+
33
+ **Rationale**: P346 is the master ticket for the framework's backlog-flow-control
34
+ mechanisms with an explicit Multi-phase scope section. The new capture's
35
+ description IS Phase 2 of P346 (additional evidence shapes + Phase 1
36
+ false-positive fixes for the `evaluate-relevance.sh` script that P346 Phase 1
37
+ introduced). P346's body already names Phase 2 work as in-scope.
38
+
39
+ **Signals matched**: shared ADR-079 reference; shared
40
+ `packages/itil/scripts/evaluate-relevance.sh` file path; shared
41
+ `/wr-itil:review-problems` SKILL reference; shared `/wr-itil:manage-problem`
42
+ SKILL reference; candidate's `## Multi-phase scope` section explicitly names
43
+ Phase 2 as in-scope.
44
+
45
+ **Where to absorb**: amend P346's Investigation Tasks checklist with the 4
46
+ new evidence shapes + Phase 1 false-positive fix items; add a `### Phase 2 —
47
+ evidence shape expansion + Phase 1 false-positive fixes` section under the
48
+ existing Multi-phase scope section; the shipped commits' attribution
49
+ references this absorption.
50
+ ```
51
+
52
+ ## Why this fixture is canonical
53
+
54
+ The wrongly-captured P347 sibling of P346 on 2026-05-31 motivated the entire Phase 3 deliverable (this agent's existence). If a future change to the agent's prose contract regresses the verdict on this fixture, the SKILL no longer fulfils its driver. This is the binary tripwire.
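Reviewer note: until RFC-012 ships, the fixtures are documentation only. A future harness would need to read each fixture's expected verdict. The sketch below shows one way that might be done, assuming the layout the three fixtures share: a `## Expected verdict` heading followed by a fenced block whose first non-blank line is the verdict. The name `expected_verdict` is hypothetical, and nothing here is taken from the RFC-012 proposal.

```shell
# Hypothetical helper, not part of the package.
# Prints the first non-blank line of the fenced block under
# "## Expected verdict" in the fixture file given as $1.
expected_verdict() {
  # Build the three-backtick fence marker from octal escapes so this sketch
  # can itself sit inside a fenced block.
  fence=$(printf '\140\140\140')
  awk -v fence="$fence" '
    /^## Expected verdict/              { in_section = 1; next }
    in_section && index($0, fence) == 1 { if (in_fence) exit; in_fence = 1; next }
    in_fence && NF                      { print; exit }
  ' "$1"
}
```

The printed line can then be compared against the verdict line the agent actually returns for the same inputs.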
@@ -0,0 +1,225 @@
1
+ #!/usr/bin/env bats
2
+ # Doc-lint guard: wr-itil:hang-off-check agent contract — the agent MUST
3
+ # carry the verdict format (HANG_OFF: P<NNN> | PROCEED_NEW), the
4
+ # fresh-context subagent rationale, the JTBD-301 maintainer-side firewall,
5
+ # the AFK safe-default (PROCEED_NEW on ambiguous), the rationale-citation
6
+ # requirement, and the three canonical fixture files. Closes P346 Phase 3
7
+ # deliverable — the SKILL's behavioural intent is documented and
8
+ # verifiable.
9
+ #
10
+ # Structural assertion — ADR-052 Surface 2 (structural-justified) +
11
+ # P176 harness gap. Behavioural execution of the three canonical fixtures
12
+ # lands under RFC-012 (promptfoo eval harness, proposed). Upgrade these
13
+ # to behavioural fixtures when RFC-012 ships.
14
+ #
15
+ # Cross-reference:
16
+ # P346 (review-problems backlog-flow-control master ticket; Phase 3
17
+ # deliverable this agent fulfils)
18
+ # P347 (closed as duplicate-of-P346; canonical regression case driving
19
+ # fixture 1)
20
+ # P176 (agent-side I2 / harness gap — Surface 2 carve-out precedent)
21
+ # ADR-032 (5th invocation pattern — fresh-context-subagent-as-decision-
22
+ # arbiter; P346 amendment 2026-05-31 codifies this agent's shape)
23
+ # ADR-052 (behavioural-tests default; Surface 2 carve-out)
24
+ # ADR-075 (promptfoo as agent-prose verdict harness — future home)
25
+ # RFC-012 (promptfoo retrofit — behavioural eval harness)
26
+ # RFC-013 (P346 multi-phase trace per ADR-071)
27
+ # @jtbd JTBD-001 (enforce governance without slowing down)
28
+ # @jtbd JTBD-006 (progress backlog while I'm away — AFK safe-default)
29
+ # @jtbd JTBD-101 (extend suite with new plugins — pattern reuse)
30
+ # @jtbd JTBD-201 (restore service fast with an audit trail — rationale)
31
+
32
+ setup() {
33
+ AGENT_DIR="$(cd "$(dirname "$BATS_TEST_FILENAME")/.." && pwd)"
34
+ AGENT_FILE="${AGENT_DIR}/hang-off-check.md"
35
+ FIXTURES_DIR="${AGENT_DIR}/test/fixtures"
36
+ }
37
+
38
+ # ----- Contract surface: verdict format + structure -----
39
+
40
+ @test "agent.md exists at packages/itil/agents/hang-off-check.md" {
41
+ [ -f "$AGENT_FILE" ]
42
+ }
43
+
44
+ @test "agent.md frontmatter declares name: hang-off-check" {
45
+ run grep -nE "^name: hang-off-check$" "$AGENT_FILE"
46
+ [ "$status" -eq 0 ]
47
+ }
48
+
49
+ @test "agent.md frontmatter limits tools to read-only set (Read, Glob, Grep)" {
50
+ # No Edit, no Write, no Bash — read-only reviewer per ADR-032 5th pattern
51
+ run grep -nE "tools:" "$AGENT_FILE"
52
+ [ "$status" -eq 0 ]
53
+ run grep -nE "^ - (Read|Glob|Grep)$" "$AGENT_FILE"
54
+ [ "$status" -eq 0 ]
55
+ # Forbid Edit / Write / Bash in the tools list
56
+ ! grep -nE "^ - (Edit|Write|Bash|MultiEdit|NotebookEdit)$" "$AGENT_FILE"
57
+ }
58
+
59
+ @test "agent.md declares HANG_OFF: P<NNN> verdict shape" {
60
+ run grep -nE 'HANG_OFF:\s*P<NNN>' "$AGENT_FILE"
61
+ [ "$status" -eq 0 ]
62
+ }
63
+
64
+ @test "agent.md declares PROCEED_NEW verdict shape" {
65
+ run grep -nE "PROCEED_NEW" "$AGENT_FILE"
66
+ [ "$status" -eq 0 ]
67
+ }
68
+
69
+ @test "agent.md requires a Rationale section on every verdict" {
70
+ run grep -nE "\*\*Rationale\*\*" "$AGENT_FILE"
71
+ [ "$status" -eq 0 ]
72
+ }
73
+
74
+ @test "agent.md requires Signals matched citation on HANG_OFF (rationale-grounding per ADR-026)" {
75
+ run grep -nE "\*\*Signals matched\*\*" "$AGENT_FILE"
76
+ [ "$status" -eq 0 ]
77
+ }
78
+
79
+ @test "agent.md requires Where to absorb directive on HANG_OFF (calling-skill action contract)" {
80
+ run grep -nE "\*\*Where to absorb\*\*" "$AGENT_FILE"
81
+ [ "$status" -eq 0 ]
82
+ }
83
+
84
+ @test "agent.md requires Per-candidate explanation on PROCEED_NEW" {
85
+ run grep -nE "\*\*Per-candidate explanation\*\*" "$AGENT_FILE"
86
+ [ "$status" -eq 0 ]
87
+ }
88
+
89
+ # ----- Driver / context-isolation rationale -----
90
+
91
+ @test "agent.md cites the fresh-context / session-context-bias driver" {
92
+ run grep -niE "(session-context bias|fresh context|context isolation)" "$AGENT_FILE"
93
+ [ "$status" -eq 0 ]
94
+ }
95
+
96
+ @test "agent.md cites the P347-vs-P346 canonical regression in the driver section" {
97
+ run grep -nE "P347" "$AGENT_FILE"
98
+ [ "$status" -eq 0 ]
99
+ run grep -nE "P346" "$AGENT_FILE"
100
+ [ "$status" -eq 0 ]
101
+ }
102
+
103
+ @test "agent.md cites ADR-032 5th-pattern codification (P346 amendment)" {
104
+ run grep -nE "ADR-032" "$AGENT_FILE"
105
+ [ "$status" -eq 0 ]
106
+ run grep -niE "5th (invocation )?pattern" "$AGENT_FILE"
107
+ [ "$status" -eq 0 ]
108
+ }
109
+
110
+ # ----- Decision rule -----
111
+
112
+ @test "agent.md names master-ticket / multi-phase as a HANG_OFF signal" {
113
+ run grep -niE "(master ticket|multi-phase|scope expansion)" "$AGENT_FILE"
114
+ [ "$status" -eq 0 ]
115
+ }
116
+
117
+ @test "agent.md names false-negative-cheaper-than-false-positive safe-default" {
118
+ run grep -niE "false[-]positive.*cheap|cheap.*false[-]positive" "$AGENT_FILE"
119
+ [ "$status" -eq 0 ]
120
+ }
121
+
122
+ @test "agent.md cites Rule 6 fail-safe / AFK safe-default (ambiguous → PROCEED_NEW)" {
123
+ run grep -niE "(ambiguous.*PROCEED_NEW|--no-prompt|AFK propagation)" "$AGENT_FILE"
124
+ [ "$status" -eq 0 ]
125
+ run grep -nE "ADR-013" "$AGENT_FILE"
126
+ [ "$status" -eq 0 ]
127
+ }
128
+
129
+ @test "agent.md explicitly forbids AskUserQuestion invocation (Rule 6)" {
130
+ run grep -niE "(never|do not|MUST NOT).*(AskUserQuestion|invoke.*AskUserQuestion)" "$AGENT_FILE"
131
+ [ "$status" -eq 0 ]
132
+ }
133
+
134
+ # ----- Scope / firewalls -----
135
+
136
+ @test "agent.md names the JTBD-301 maintainer-side firewall" {
137
+ run grep -nE "JTBD-301" "$AGENT_FILE"
138
+ [ "$status" -eq 0 ]
139
+ run grep -niE "maintainer-side|maintainer-internal" "$AGENT_FILE"
140
+ [ "$status" -eq 0 ]
141
+ }
142
+
143
+ @test "agent.md excludes plugin-user-side intake from the dispatch (firewall)" {
144
+ run grep -niE "plugin-user-side|problem-report\.yml" "$AGENT_FILE"
145
+ [ "$status" -eq 0 ]
146
+ }
147
+
148
+ @test "agent.md excludes manage-problem ingestion-of-plugin-user-reports path" {
149
+ run grep -niE "ingestion[- ]of[- ]plugin[- ]user[- ]reports" "$AGENT_FILE"
150
+ [ "$status" -eq 0 ]
151
+ }
152
+
153
+ @test "agent.md scopes cardinality to one verdict per invocation" {
154
+ run grep -niE "one verdict per invocation|cardinality" "$AGENT_FILE"
155
+ [ "$status" -eq 0 ]
156
+ }
157
+
158
+ # ----- Output formatting (matches architect / jtbd output formatting convention) -----
159
+
160
+ @test "agent.md carries Output Formatting section requiring human-readable IDs" {
161
+ run grep -nE "## Output Formatting" "$AGENT_FILE"
162
+ [ "$status" -eq 0 ]
163
+ }
164
+
165
+ # ----- JTBD annotation (per JTBD review nit 5) -----
166
+
167
+ @test "agent.md carries @jtbd annotation citing JTBD-001/006/101/201" {
168
+ run grep -nE "^# @jtbd .*JTBD-001.*JTBD-006.*JTBD-101.*JTBD-201" "$AGENT_FILE"
169
+ [ "$status" -eq 0 ]
170
+ }
171
+
172
+ # ----- Canonical behavioural fixtures (RFC-012 future-home) -----
173
+
174
+ @test "fixture 1 (canonical P347-vs-P346 regression) exists" {
175
+ [ -f "${FIXTURES_DIR}/regression-p347-vs-p346.md" ]
176
+ }
177
+
178
+ @test "fixture 1 names HANG_OFF: P346 as the expected verdict" {
179
+ run grep -nE "HANG_OFF: P346" "${FIXTURES_DIR}/regression-p347-vs-p346.md"
180
+ [ "$status" -eq 0 ]
181
+ }
182
+
183
+ @test "fixture 1 cites P347 as the wrongly-captured-sibling motivator" {
184
+ run grep -nE "P347" "${FIXTURES_DIR}/regression-p347-vs-p346.md"
185
+ [ "$status" -eq 0 ]
186
+ }
187
+
188
+ @test "fixture 2 (genuinely-new) exists" {
189
+ [ -f "${FIXTURES_DIR}/proceed-new-genuinely-new.md" ]
190
+ }
191
+
192
+ @test "fixture 2 names PROCEED_NEW as the expected verdict" {
193
+ run grep -nE "PROCEED_NEW" "${FIXTURES_DIR}/proceed-new-genuinely-new.md"
194
+ [ "$status" -eq 0 ]
195
+ }
196
+
197
+ @test "fixture 3 (subtle sibling-vs-parent) exists" {
198
+ [ -f "${FIXTURES_DIR}/proceed-new-subtle-sibling.md" ]
199
+ }
200
+
201
+ @test "fixture 3 names PROCEED_NEW with reasoned per-candidate rationale" {
202
+ run grep -nE "PROCEED_NEW" "${FIXTURES_DIR}/proceed-new-subtle-sibling.md"
203
+ [ "$status" -eq 0 ]
204
+ run grep -niE "Per-candidate explanation" "${FIXTURES_DIR}/proceed-new-subtle-sibling.md"
205
+ [ "$status" -eq 0 ]
206
+ }
207
+
208
+ # ----- RFC-012 forward-reference -----
209
+
210
+ @test "agent.md cross-references RFC-012 as future behavioural-eval home" {
211
+ run grep -nE "RFC-012" "$AGENT_FILE"
212
+ [ "$status" -eq 0 ]
213
+ }
214
+
215
+ # ----- Cross-references to dispatch sites -----
216
+
217
+ @test "agent.md cross-references /wr-itil:capture-problem as primary dispatch site" {
218
+ run grep -nE "/wr-itil:capture-problem" "$AGENT_FILE"
219
+ [ "$status" -eq 0 ]
220
+ }
221
+
222
+ @test "agent.md cross-references /wr-itil:manage-problem as secondary dispatch site" {
223
+ run grep -nE "/wr-itil:manage-problem" "$AGENT_FILE"
224
+ [ "$status" -eq 0 ]
225
+ }
@@ -31,8 +31,29 @@
31
31
  # - otherwise: publishable source — record the slug.
32
32
  # * any other path: ignored (non-publishable surface — `.github/`,
33
33
  # root config, top-level `docs/`, etc.).
34
- # - If any path is publishable source AND no valid changeset is
35
- # staged, return 1 + echo the slug.
34
+ # - If any path is publishable source:
35
+ # * **Check 2a (Phase 1)**: a `.changeset/*.md` (or held-window
36
+ # `docs/changesets-holding/*.md` per P177) staged → allow.
37
+ # * **Check 2b (Phase 2)**: an in-scope `.changeset/*.md` (or
38
+ # held-window entry) targeting the plugin via YAML frontmatter
39
+ # `"@windyroad/<slug>": <any-bump>` → allow. Scope =
40
+ # in-unpushed-range additions (`<base>..HEAD`) + untracked
41
+ # working-tree files + modified-not-staged working-tree files.
42
+ # Base = `@{u}` (current branch upstream) with fallback to
43
+ # `origin/main`. Once consumed onto origin (drained by
44
+ # changesets-action), the changeset is gone and a fresh one
45
+ # is required.
46
+ # * Neither check satisfied → return 1 + echo the slug.
47
+ #
48
+ # Phase 2 rationale (P141 2026-05-31): AFK orchestrator iters that
49
+ # ship a multi-commit slice for one plugin (e.g. P346 Phase 3 across
50
+ # 4 commits, 2 of which touched `packages/itil/`) should not author N
51
+ # redundant changesets for one logical bump. changesets-action
52
+ # collapses bump-class at version-package time, so per-commit
53
+ # changesets render N CHANGELOG bullets for one release entry. Phase
54
+ # 2's Check 2b lets the author write the changeset on the FIRST
55
+ # commit; subsequent same-plugin commits naturally allow because the
56
+ # changeset is already in the unpushed-range scope.
36
57
  #
37
58
  # Bypass:
38
59
  # - `BYPASS_CHANGESET_GATE=1` env var → return 0 (allow). For
@@ -77,14 +98,85 @@
77
98
  # shape — per-invocation deterministic, no markers).
78
99
  # P141 — this helper.
79
100
 
101
+ # P141 Phase 2 helper — does any `.changeset/*.md` (or held entry under
102
+ # `docs/changesets-holding/*.md`) ALREADY in scope target the plugin
103
+ # slug via its YAML frontmatter `"@windyroad/<slug>": <bump>` line?
104
+ #
105
+ # Scope = files reachable from HEAD but not from `origin/<base>`,
106
+ # plus untracked working-tree changesets, plus modified-not-staged
107
+ # changesets. Once a changeset is on `origin/<base>` (drained by
108
+ # changesets-action at release time), it no longer counts — Check 2b
109
+ # requires a fresh changeset for the next slice.
110
+ #
111
+ # Per-plugin granularity (NOT per-bump-class — changesets-action
112
+ # collapses bump-class at version-package time when multiple
113
+ # changesets for the same plugin merge; the published bump-class is
114
+ # the maximum across the merged set).
115
+ #
116
+ # Base resolution: prefer the current branch's upstream (`@{u}`),
117
+ # fall back to `origin/main`. If neither resolves (e.g. fresh
118
+ # repo with no remotes), Check 2b returns 1 (no in-range scope to
119
+ # inspect) — Phase 1 strict-deny behaviour is preserved.
120
+ #
121
+ # Returns: 0 (an in-scope changeset covers the plugin → allow)
122
+ # 1 (no covering changeset found → caller falls through)
123
+ _changeset_in_scope_covers_plugin() {
124
+ local slug="$1"
125
+ local base
126
+ local candidates path
127
+
128
+ base=$(git rev-parse --abbrev-ref --symbolic-full-name '@{u}' 2>/dev/null) \
129
+ || base="origin/main"
130
+ git rev-parse --verify --quiet "$base" >/dev/null 2>&1 || return 1
131
+
132
+ # Enumerate candidate changeset files:
133
+ # 1. In-range additions: changesets added in unpushed commits
134
+ # (`<base>..HEAD`). A changeset later deleted in the same
135
+ # range is filtered by the on-disk existence check below.
136
+ # 2. Untracked: changesets in the working tree not yet tracked
137
+ # by git (author wrote but did not stage).
138
+ # 3. Modified-not-staged: changesets edited since their last
139
+ # commit but not yet re-staged.
140
+ # Excludes `*/README.md` meta-docs (mirrors the staged-path branch).
141
+ candidates=$(
142
+ {
143
+ git log --diff-filter=A --name-only --pretty=format: "${base}..HEAD" \
144
+ -- '.changeset/*.md' 'docs/changesets-holding/*.md' 2>/dev/null
145
+ git ls-files --others --exclude-standard \
146
+ -- '.changeset/*.md' 'docs/changesets-holding/*.md' 2>/dev/null
147
+ git diff --name-only \
148
+ -- '.changeset/*.md' 'docs/changesets-holding/*.md' 2>/dev/null
149
+ } | grep -v '/README\.md$' | sort -u
150
+ )
151
+
152
+ [ -n "$candidates" ] || return 1
153
+
154
+ while IFS= read -r path; do
155
+ [ -n "$path" ] || continue
156
+ [ -f "$path" ] || continue
157
+ # Extract YAML frontmatter (lines between the first two `---`
158
+ # markers) and match the canonical `"@windyroad/<slug>":` line.
159
+ # awk scoping prevents false positives from prose body mentions.
160
+ if awk '/^---[[:space:]]*$/ { c++; if (c == 1) next; if (c == 2) exit } c == 1 { print }' "$path" 2>/dev/null \
161
+ | grep -qE "^\"@windyroad/${slug}\":[[:space:]]"; then
162
+ return 0
163
+ fi
164
+ done <<EOF
165
+ $candidates
166
+ EOF
167
+
168
+ return 1
169
+ }
170
+
80
171
  # Detect whether the current staged set requires a changeset that is
81
- # not staged.
172
+ # not satisfied by either staged Check 2a or in-scope Check 2b.
82
173
  #
83
174
  # Echoes the offending plugin slug on stdout when detected.
84
175
  #
85
176
  # Returns:
86
- # 0 — no change required, or BYPASS env set, or fail-open (allow)
87
- # 1 — change required + no changeset staged (caller should deny)
177
+ # 0 — no change required, BYPASS env set, fail-open, or an in-scope
178
+ # changeset covers the plugin (Phase 2 Check 2b)
179
+ # 1 — change required + no covering changeset (caller should deny)
88
180
  detect_changeset_required() {
89
181
  # Bypass via env var — single most-common legitimate escape.
90
182
  if [ "${BYPASS_CHANGESET_GATE:-}" = "1" ]; then
@@ -170,10 +262,22 @@ detect_changeset_required() {
170
262
  $staged
171
263
  EOF
172
264
 
173
- if [ -n "$plugin_source_slug" ] && [ "$has_changeset" -eq 0 ]; then
174
- printf '%s\n' "$plugin_source_slug"
175
- return 1
265
+ # No publishable plugin source staged → allow.
266
+ [ -n "$plugin_source_slug" ] || return 0
267
+
268
+ # Check 2a — staged changeset satisfies (Phase 1 behaviour).
269
+ if [ "$has_changeset" -eq 1 ]; then
270
+ return 0
271
+ fi
272
+
273
+ # Check 2b (P141 Phase 2) — in-scope changeset targeting the plugin
274
+ # satisfies. Scope = unpushed-range commits + untracked + modified-
275
+ # not-staged working-tree files. Once consumed onto origin, the
276
+ # changeset is gone and a fresh one is required.
277
+ if _changeset_in_scope_covers_plugin "$plugin_source_slug"; then
278
+ return 0
176
279
  fi
177
280
 
178
- return 0
281
+ printf '%s\n' "$plugin_source_slug"
282
+ return 1
179
283
  }
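
The frontmatter scan that Check 2b relies on can be tried in isolation. The sketch below reuses the awk and grep expressions from `_changeset_in_scope_covers_plugin`; the wrapper name `frontmatter_targets_plugin`, the `demo_dir` variable, and the sample file are illustrative assumptions and are not part of the package.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper (not in the package) around the awk + grep pair
# used by _changeset_in_scope_covers_plugin.
frontmatter_targets_plugin() {
  local path="$1" slug="$2"
  # Print only the lines between the first two `---` markers, then look
  # for the canonical `"@windyroad/<slug>":` key at the start of a line.
  awk '/^---[[:space:]]*$/ { c++; if (c == 1) next; if (c == 2) exit } c == 1 { print }' "$path" \
    | grep -qE "^\"@windyroad/${slug}\":[[:space:]]"
}

demo_dir=$(mktemp -d)
# Frontmatter targets itil; the prose body merely mentions voice-tone.
printf -- '---\n"@windyroad/itil": patch\n---\n"@windyroad/voice-tone": patch is mentioned in prose only\n' \
  > "$demo_dir/example.md"

if frontmatter_targets_plugin "$demo_dir/example.md" itil; then
  echo "itil: covered"
fi
if ! frontmatter_targets_plugin "$demo_dir/example.md" voice-tone; then
  echo "voice-tone: not covered"
fi
```

Because awk stops at the second `---`, a plugin name that appears only in the changeset's prose body should not count as coverage, which is the per-plugin behaviour the helper's comments describe.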
@@ -421,3 +421,156 @@ run_bash_hook() {
421
421
  [ "$status" -eq 0 ]
422
422
  [[ "$output" == *"\"permissionDecision\": \"deny\""* ]]
423
423
  }
424
+
425
+ # --- P141 Phase 2: in-scope-changeset coverage for multi-commit slices ---
426
+ #
427
+ # Phase 2 amendment (2026-05-31) widens the allow path: a `.changeset/*.md`
428
+ # already in the unpushed slice scope (committed in a prior unpushed commit,
429
+ # untracked in the working tree, or modified-not-staged) that targets the
430
+ # plugin via its YAML frontmatter `"@windyroad/<plugin>": <any-bump>` line
431
+ # also satisfies the gate. Phase 1 strict-deny behaviour preserved for the
432
+ # no-coverage case.
433
+ #
434
+ # Scope boundary: `<base>..HEAD` where base = `@{u}` upstream tracking branch
435
+ # with fallback to `origin/main`. Once a changeset is on `origin/<base>`
436
+ # (drained by changesets-action), it no longer counts — Phase 2 boundary
437
+ # fixture below proves this.
438
+ #
439
+ # Per-plugin granularity: an `@windyroad/itil` changeset does NOT cover a
440
+ # `packages/voice-tone/` commit — wrong-plugin negative fixture below.
441
+
442
+ # Helper: mark the current HEAD as `origin/main` so subsequent commits
443
+ # fall into the unpushed-range scope `origin/main..HEAD`. The bats setup
444
+ # creates a local repo with no remote; this synthesises an origin/main
445
+ # ref via `git update-ref` for behavioural testing.
446
+ mark_origin_at_head() {
447
+ git update-ref refs/remotes/origin/main HEAD
448
+ }
449
+
450
+ @test "P141 Phase 2 allow: in-range committed changeset for plugin covers subsequent same-plugin commit" {
451
+ mark_origin_at_head
452
+ # Commit 1: ship the changeset + initial source together (Phase 1 case).
453
+ echo "skill body 1" > packages/itil/skills/foo/SKILL.md
454
+ printf -- '---\n"@windyroad/itil": patch\n---\nfix the thing\n' > .changeset/wr-itil-p347.md
455
+ git add packages/itil/skills/foo/SKILL.md .changeset/wr-itil-p347.md
456
+ git -c commit.gpgsign=false commit --quiet -m "feat 1"
457
+ # Commit 2: stage more itil source — no new changeset, but the in-range
458
+ # changeset from commit 1 covers @windyroad/itil. Gate must allow.
459
+ echo "more skill" > packages/itil/skills/foo/SKILL.md
460
+ git add packages/itil/skills/foo/SKILL.md
461
+ run run_bash_hook "git commit -m 'feat 2'"
462
+ [ "$status" -eq 0 ]
463
+ [[ "$output" != *"\"permissionDecision\": \"deny\""* ]]
464
+ # Silent pass per ADR-045 Pattern 1.
465
+ [ "${#output}" -eq 0 ]
466
+ }
467
+
468
+ @test "P141 Phase 2 deny boundary: changeset consumed onto origin no longer counts; fresh required" {
469
+ # Commit 1: ship the changeset onto the base.
470
+ printf -- '---\n"@windyroad/itil": patch\n---\nfix the thing\n' > .changeset/wr-itil-p347.md
471
+ git add .changeset/wr-itil-p347.md
472
+ git -c commit.gpgsign=false commit --quiet -m "changeset on base"
473
+ # Mark as drained-to-origin — changesets-action consumed it at release.
474
+ mark_origin_at_head
475
+ # Remove the file as changesets-action would on consumption.
476
+ git rm --quiet .changeset/wr-itil-p347.md
477
+ git -c commit.gpgsign=false commit --quiet -m "changeset consumed on origin"
478
+ mark_origin_at_head
479
+ # Now stage fresh itil source — no in-range changeset, no staged changeset.
480
+ echo "skill body" > packages/itil/skills/foo/SKILL.md
481
+ git add packages/itil/skills/foo/SKILL.md
482
+ run run_bash_hook "git commit -m 'feat'"
483
+ [ "$status" -eq 0 ]
484
+ [[ "$output" == *"\"permissionDecision\": \"deny\""* ]]
485
+ [[ "$output" == *"P141"* ]]
486
+ }
487
+
488
+ @test "P141 Phase 2 deny: in-range changeset for DIFFERENT plugin does not cover this plugin's source (wrong-plugin)" {
489
+ mark_origin_at_head
490
+ # Commit 1: ship an @windyroad/itil changeset.
491
+ echo "itil source" > packages/itil/skills/foo/SKILL.md
492
+ printf -- '---\n"@windyroad/itil": patch\n---\nfix itil\n' > .changeset/wr-itil-p347.md
493
+ git add packages/itil/skills/foo/SKILL.md .changeset/wr-itil-p347.md
494
+ git -c commit.gpgsign=false commit --quiet -m "feat itil"
495
+ # Commit 2: stage voice-tone source. The in-range itil changeset must
496
+ # NOT satisfy the gate for a different plugin (per-plugin granularity).
497
+ mkdir -p packages/voice-tone/src
498
+ echo "voice source" > packages/voice-tone/src/x.ts
499
+ git add packages/voice-tone/src/x.ts
500
+ run run_bash_hook "git commit -m 'feat voice'"
501
+ [ "$status" -eq 0 ]
502
+ [[ "$output" == *"\"permissionDecision\": \"deny\""* ]]
503
+ # Deny message names the offending plugin slug — voice-tone, not itil.
504
+ [[ "$output" == *"voice-tone"* ]]
505
+ }
506
+
507
+ @test "P141 Phase 2 allow: untracked .changeset/*.md targeting plugin covers staged source" {
508
+ mark_origin_at_head
509
+ # Author the changeset to disk but DO NOT stage. Gate must still
510
+ # recognise it via `git ls-files --others --exclude-standard`.
511
+ printf -- '---\n"@windyroad/itil": minor\n---\nfeature\n' > .changeset/wr-itil-p347.md
512
+ echo "skill body" > packages/itil/skills/foo/SKILL.md
513
+ git add packages/itil/skills/foo/SKILL.md
514
+ run run_bash_hook "git commit -m 'feat'"
515
+ [ "$status" -eq 0 ]
516
+ [[ "$output" != *"\"permissionDecision\": \"deny\""* ]]
517
+ [ "${#output}" -eq 0 ]
518
+ }
519
+
520
+ @test "P141 Phase 2 allow: in-range changeset that was modified-not-staged still covers" {
521
+ mark_origin_at_head
522
+ # Commit 1: ship the changeset.
523
+ printf -- '---\n"@windyroad/itil": patch\n---\noriginal\n' > .changeset/wr-itil-p347.md
524
+ git add .changeset/wr-itil-p347.md
525
+ git -c commit.gpgsign=false commit --quiet -m "changeset"
526
+ # Modify the prose body (frontmatter preserved); do NOT stage the edit.
527
+ printf -- '---\n"@windyroad/itil": patch\n---\nedited prose\n' > .changeset/wr-itil-p347.md
528
+ # Stage source.
529
+ echo "skill body" > packages/itil/skills/foo/SKILL.md
530
+ git add packages/itil/skills/foo/SKILL.md
531
+ run run_bash_hook "git commit -m 'feat'"
532
+ [ "$status" -eq 0 ]
533
+ [[ "$output" != *"\"permissionDecision\": \"deny\""* ]]
534
+ }
535
+
536
+ @test "P141 Phase 2 deny: in-range changeset for plugin exists but its frontmatter targets a different plugin slug" {
537
+ mark_origin_at_head
538
+ # A changeset committed in-range whose frontmatter declares ONLY
539
+ # @windyroad/voice-tone, not @windyroad/itil. Staged source is itil.
540
+ # Check 2b must NOT match (frontmatter scan is per-plugin-slug).
541
+ printf -- '---\n"@windyroad/voice-tone": patch\n---\nfix voice\n' > .changeset/wr-voice-p999.md
542
+ git add .changeset/wr-voice-p999.md
543
+ git -c commit.gpgsign=false commit --quiet -m "voice changeset"
544
+ echo "skill body" > packages/itil/skills/foo/SKILL.md
545
+ git add packages/itil/skills/foo/SKILL.md
546
+ run run_bash_hook "git commit -m 'feat itil'"
547
+ [ "$status" -eq 0 ]
548
+ [[ "$output" == *"\"permissionDecision\": \"deny\""* ]]
549
+ [[ "$output" == *"itil"* ]]
550
+ }
551
+
552
+ @test "P141 Phase 2 allow: held-window docs/changesets-holding/*.md in-range entry also covers (ADR-042 Rule 7 composes with Phase 2)" {
553
+ mark_origin_at_head
554
+ mkdir -p docs/changesets-holding
555
+ printf -- '---\n"@windyroad/itil": patch\n---\nheld fix\n' > docs/changesets-holding/wr-itil-p347.md
556
+ git add docs/changesets-holding/wr-itil-p347.md
557
+ git -c commit.gpgsign=false commit --quiet -m "held changeset"
558
+ # Subsequent itil source commit — held entry in range covers.
559
+ echo "skill body" > packages/itil/skills/foo/SKILL.md
560
+ git add packages/itil/skills/foo/SKILL.md
561
+ run run_bash_hook "git commit -m 'feat'"
562
+ [ "$status" -eq 0 ]
563
+ [[ "$output" != *"\"permissionDecision\": \"deny\""* ]]
564
+ }
565
+
566
+ @test "P141 Phase 2: when no upstream and no origin/main ref exists, Check 2b skips silently and Phase 1 strict-deny is preserved" {
567
+ # No mark_origin_at_head — refs/remotes/origin/main is absent.
568
+ # Stage source without any changeset. Phase 1 strict-deny must fire
569
+ # (Check 2b returns 1 on missing base; Check 2a finds no staged changeset).
570
+ echo "skill body" > packages/itil/skills/foo/SKILL.md
571
+ git add packages/itil/skills/foo/SKILL.md
572
+ run run_bash_hook "git commit -m 'feat'"
573
+ [ "$status" -eq 0 ]
574
+ [[ "$output" == *"\"permissionDecision\": \"deny\""* ]]
575
+ [[ "$output" == *"P141"* ]]
576
+ }
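
The allow/deny matrix these fixtures exercise can be summarised as a small decision function. This is a sketch only: `gate_decision` and its three 0/1 flags are hypothetical stand-ins for state that the real `detect_changeset_required` derives from git, and the in-scope flag is assumed to be already evaluated for the same plugin as the staged source.

```bash
#!/usr/bin/env bash
# Hypothetical summary of the gate's decision order (not the package's code).
# Arguments are 0/1 flags:
#   $1 publishable plugin source is staged
#   $2 a changeset is staged                      (Check 2a)
#   $3 an in-scope changeset targets this plugin  (Check 2b)
gate_decision() {
  local has_plugin_source="$1" has_staged_changeset="$2" has_in_scope_changeset="$3"
  [ "$has_plugin_source" -eq 1 ] || { echo allow; return 0; }       # nothing publishable staged
  [ "$has_staged_changeset" -eq 1 ] && { echo allow; return 0; }    # Check 2a
  [ "$has_in_scope_changeset" -eq 1 ] && { echo allow; return 0; }  # Check 2b
  echo deny
  return 1
}

gate_decision 1 0 1          # second commit of a slice, changeset already in range
gate_decision 1 0 0 || true  # no coverage at all: the Phase 1 strict-deny case
```

The wrong-plugin fixtures correspond to the third flag being 0 even though some changeset exists in range, since coverage is per plugin.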
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@windyroad/itil",
3
- "version": "0.41.0",
3
+ "version": "0.42.0",
4
4
  "description": "ITIL-aligned IT service management for Claude Code (problem, and future incident/change skills)",
5
5
  "bin": {
6
6
  "windyroad-itil": "./bin/install.mjs"
@@ -28,6 +28,7 @@ This skill has **at most one classification-only AskUserQuestion (type-tag, ambi
28
28
  | Decision | Resolution |
29
29
  |----------|-----------|
30
30
  | Duplicate-check | Mechanical 3-keyword title-only grep; matches listed in report; capture proceeds regardless. False-positives are cheaper than false-negatives (P155 line 24). |
31
+ | Hang-off arbitration (P346 Phase 3) | Mechanical pre-filter (≤5 candidates by shared ADR/RFC/SKILL/file signal) + fresh-context `wr-itil:hang-off-check` subagent dispatch (ADR-032 5th invocation pattern). Verdict-acts: `HANG_OFF: P<NNN>` halts capture + routes orchestrator to amend parent; `PROCEED_NEW` continues + appends rationale to `## Related`. AFK safe-default: ambiguous → PROCEED_NEW per subagent Rule 6 contract. JTBD-301 firewall: maintainer-side only. |
31
32
  | Priority default | Framework-policy: `3 (Medium) — Impact 3 × Likelihood 1` flagged "deferred — re-rate at next /wr-itil:review-problems". |
32
33
  | Effort default | Framework-policy: `M` flagged "deferred — re-rate at next /wr-itil:review-problems". |
33
34
  | Multi-concern split | Out of scope: capture-problem creates one ticket per invocation. Multi-concern observations route to `/wr-itil:manage-problem` (its Step 4b owns the split). |
@@ -138,7 +139,9 @@ Per ADR-060 § Phase 3 + Phase 4 in-scope amendment (2026-05-13). Fires ONLY whe
138
139
 
139
140
  **Phase 3 P3.1 nullable-field-conditional shape**: the JTBD-trace prompt + I12 hard-block fire on `jtbd_trace_value` nullability (absent vs present), NOT on the `type` field's value. The composite gate (`type == user-business AND jtbd_trace_value == empty`) treats `type` as upstream-determined co-incident input — exactly the carve-out permitted by ADR-060 line 536. Steps 2-7 below execute identically regardless of `type_value`, `jtbd_trace_value`, or `persona_value`; only the values substituted into the Step 4 skeleton template differ. This preserves I2 control-flow uniformity AND extends the I2 behavioural test (per ADR-060 Confirmation criterion 11) to assert no control-flow branch on `persona:` field presence.
140
141
 
141
- ### 2. Minimal-grep duplicate check (3-keyword title-only)
142
+ ### 2. Minimal-grep duplicate check (3-keyword title-only) + hang-off-check subagent dispatch (P346 Phase 3 amendment, 2026-05-31)
143
+
144
+ **Sub-step 2a — title-only grep (pre-existing minimal duplicate check)**
142
145
 
143
146
  Extract up to **3 distinct kebab-cased non-stopword keywords** from the description. Grep the **filenames** of `docs/problems/*.md` AND `docs/problems/<state>/*.md` (NOT bodies — title-only is the conservative threshold per architect verdict on Q1; dual-tolerant per RFC-002 migration window):
144
147
 
@@ -153,7 +156,78 @@ The **3-keyword cap** is a hard-coded constant. Do NOT make it env-overridable
153
156
 
154
157
  If matches are found: list them in the final report. **Do NOT halt or branch.** Capture proceeds. The user can resolve duplicates at the next `/wr-itil:review-problems` invocation (or invoke `/wr-itil:manage-problem` directly if the duplicate-check shape needs a structured branch).
155
158
 
156
- **After the grep completes**, write the per-session create-gate marker so the `PreToolUse:Write` hook (P119) permits the subsequent Write of the new ticket file under `docs/problems/open/`. Per **P260 / ADR-050 Option C**, write it under EVERY recent candidate session SID (not just one) so a concurrent orchestrator+subprocess race cannot land the marker under the wrong UUID:
159
+ **Sub-step 2b — hang-off-check via fresh-context subagent (P346 Phase 3; ADR-032 5th invocation pattern)**
160
+
161
+ The 3-keyword title-only grep at sub-step 2a is conservative: it catches narrow shape-overlap on titles but misses the wider class of hang-off candidates — parent tickets where the new capture's scope belongs as an Investigation Tasks expansion / Phase N section rather than as a sibling ticket. The wrongly-captured P347 sibling of P346 on 2026-05-31 is the canonical regression: the main agent (mid-iter, with rich session context) pattern-matched the existing capture flow and missed that the new spec belonged inside P346 as Phase 3.
162
+
163
+ Sub-step 2b adds a **mechanical pre-filter + fresh-context subagent dispatch** that closes this gap without re-introducing the main agent's session-context bias. The subagent runs in isolation (no parent-session context) and emits a structured verdict the SKILL acts on deterministically.
164
+
165
+ **Mechanical pre-filter** — grep `docs/problems/open/*.md` + `docs/problems/verifying/*.md` BODIES for tokens shared with the description: any `ADR-NNN` reference, `RFC-NNN` reference, `JTBD-NNN` reference, SKILL path (`/wr-<plugin>:<skill>` or `packages/<plugin>/skills/<skill>/`), file path (`packages/...`, `docs/...`, `.github/...`, `bin/...`, `scripts/...`), or named feature word the description cites. Collect candidates that share **≥1** signal.
166
+
167
+ ```bash
168
+ # Extract candidate signals from the description (post-flag-strip).
169
+ adr_refs=$(echo "$description" | grep -oE 'ADR-[0-9]{3}' | sort -u)
170
+ rfc_refs=$(echo "$description" | grep -oE 'RFC-[0-9]{3}' | sort -u)
171
+ skill_refs=$(echo "$description" | grep -oE '/wr-[a-z-]+:[a-z-]+' | sort -u)
172
+ file_refs=$(echo "$description" | grep -oE '(packages|docs|\.github|bin|scripts)/[a-zA-Z0-9_./-]+' | sort -u)
173
+ signals="$adr_refs"$'\n'"$rfc_refs"$'\n'"$skill_refs"$'\n'"$file_refs"
174
+ signals=$(echo "$signals" | grep -v '^$' | sort -u)
175
+
176
+ # If no signals extractable from description, skip the dispatch entirely
177
+ # (the title-only grep at 2a is the only duplicate check this capture gets).
178
+ [ -z "$signals" ] && SKIP_HANG_OFF_CHECK=1
179
+
180
+ # Otherwise: pre-filter candidates from open/ + verifying/ that share ≥1 signal.
181
+ candidates=()
182
+ if [ -z "$SKIP_HANG_OFF_CHECK" ]; then
183
+ for f in docs/problems/open/*.md docs/problems/verifying/*.md; do
184
+ [ -f "$f" ] || continue
185
+ for sig in $signals; do
186
+ if grep -qF "$sig" "$f"; then
187
+ candidates+=("$f")
188
+ break
189
+ fi
190
+ done
191
+ done
192
+ fi
193
+ ```
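
To see what the extraction step yields, the patterns from the snippet above can be run against a sample description. This is a sketch under stated assumptions: the description text and the `extract_signals` name are invented for illustration, and the `JTBD-[0-9]{3}` pattern is an addition that mirrors the prose list of signals; the shipped snippet does not extract JTBD references.

```bash
#!/usr/bin/env bash
# Sample description, invented for illustration.
description='Extend /wr-itil:capture-problem per ADR-032 and RFC-013 (JTBD-006); touches packages/itil/agents/hang-off-check.md'

# Hypothetical helper: same patterns as the pre-filter snippet, plus an
# assumed JTBD-NNN pattern. Prints one signal per line, de-duplicated.
extract_signals() {
  printf '%s\n' "$1" | grep -oE \
    -e 'ADR-[0-9]{3}' -e 'RFC-[0-9]{3}' -e 'JTBD-[0-9]{3}' \
    -e '/wr-[a-z-]+:[a-z-]+' \
    -e '(packages|docs|\.github|bin|scripts)/[a-zA-Z0-9_./-]+' \
    | LC_ALL=C sort -u
}

extract_signals "$description"
```

One thing the sketch makes visible: the file-path pattern accepts `.` and `-`, so a path that ends a sentence would carry its trailing full stop into the signal. Whether that matters depends on how descriptions are written.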
194
+
195
+ **Candidate-cap short-circuit (latency-bound per ADR-032 + JTBD-001's 60s flow budget)**: if `${#candidates[@]} -gt 5`, **skip the subagent dispatch** and record the candidate list in the captured ticket's `## Related` section for review-time re-evaluation by `/wr-itil:review-problems`. Wide candidate sets blow the lightweight-capture latency budget; the safe default is "skip + defer to cluster pass."
196
+
197
+ **Empty-candidates short-circuit**: if `${#candidates[@]} -eq 0` (no shared signals), skip the dispatch and proceed to the marker step. The mechanical pre-filter found nothing to arbitrate.
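
The two short-circuits and the dispatch case can be read as a three-way branch on the candidate count. A minimal sketch follows; the cap of 5 comes from the text above, while the function name and the action labels are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: maps the pre-filter candidate count to the action
# described in the two short-circuit rules.
hang_off_dispatch_action() {
  local count="$1" cap=5
  if [ "$count" -eq 0 ]; then
    echo "skip-empty"    # nothing to arbitrate; proceed to the marker step
  elif [ "$count" -gt "$cap" ]; then
    echo "skip-defer"    # record candidates under ## Related for review-problems
  else
    echo "dispatch"      # 1..5 candidates: delegate to wr-itil:hang-off-check
  fi
}

hang_off_dispatch_action 0
hang_off_dispatch_action 3
hang_off_dispatch_action 6
```

In the SKILL snippet the count would be `${#candidates[@]}`.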
198
+
199
+ **JTBD-301 firewall** — sub-step 2b fires on maintainer-side `/wr-itil:capture-problem` invocations ONLY. Plugin-user-side `.github/ISSUE_TEMPLATE/problem-report.yml` MUST NOT carry an equivalent dispatch (plugin-user descriptions do not carry the same authorial intent; a plugin-user describing their friction in maintainer vocabulary could plausibly trigger a wrong-parent HANG_OFF). Triage during `/wr-itil:manage-problem` ingestion stays user-judgement per JTBD-301. Mirrors the existing Step 1.5 firewall at line 116.
200
+
201
+ **AFK safe-default (--no-prompt / AFK propagation)**: when `--no-prompt` is set, the dispatch still fires (the subagent verdict is non-interactive by construction — no `AskUserQuestion`), and ambiguous-multi-parent cases collapse to `PROCEED_NEW` per the subagent's Rule 6 contract. This satisfies JTBD-006's "Decisions normally requiring my input are resolved using safe defaults."
202
+
203
+ **Dispatch** — when the candidate set is non-empty and ≤5, delegate to `wr-itil:hang-off-check` via the Agent tool with a structured input payload:
204
+
205
+ ```
206
+ SURFACE: capture-problem-step-2b
207
+
208
+ <new-capture>
209
+ <description verbatim, post-flag-strip>
210
+ </new-capture>
211
+
212
+ <candidates>
213
+ P<NNN1> | <title1> | <path1> | shared-signals: <signal1, signal2, ...>
214
+ P<NNN2> | <title2> | <path2> | shared-signals: <signal1, signal3, ...>
215
+ ...
216
+ </candidates>
217
+ ```
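
One way a `<candidates>` row might be assembled from a pre-filtered ticket file is sketched below. Several details are assumptions rather than anything the SKILL specifies: that the ticket filename begins with its numeric ID (as in the `170-...md` path cited elsewhere in this file), that the title is the first `# ` heading, and the helper name `build_candidate_row`. The sample ticket is invented.

```bash
#!/usr/bin/env bash
# Hypothetical row builder for the <candidates> payload section.
# $1 = candidate ticket path, $2 = newline-separated signals.
build_candidate_row() {
  local path="$1" signals="$2"
  local id title shared="" sig
  # Assumption: filename starts with the numeric ticket ID.
  id=$(basename "$path" | grep -oE '^[0-9]+')
  # Assumption: the title is the first Markdown H1 in the ticket.
  title=$(grep -m1 -E '^# ' "$path" | sed 's/^# //')
  while IFS= read -r sig; do
    [ -n "$sig" ] || continue
    if grep -qF -- "$sig" "$path"; then
      shared="${shared:+$shared, }$sig"
    fi
  done <<EOF
$signals
EOF
  printf 'P%s | %s | %s | shared-signals: %s\n' "$id" "$title" "$path" "$shared"
}

demo_dir=$(mktemp -d)
printf '# Problem 346: backlog flow control\n\nSee ADR-032 and RFC-013.\n' \
  > "$demo_dir/346-backlog-flow-control.md"
build_candidate_row "$demo_dir/346-backlog-flow-control.md" \
  "$(printf 'ADR-032\nJTBD-006\nRFC-013')"
```

Only the signals actually present in the ticket body are listed, so the subagent sees why each candidate survived the pre-filter.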
218
+
219
+ The subagent reads the candidate ticket bodies in full as needed (via its own Read tool), reasons about absorb-vs-proceed, and returns one of:
220
+
221
+ - `HANG_OFF: P<NNN>` with **Rationale**, **Signals matched**, **Where to absorb** sections.
222
+ - `PROCEED_NEW` with **Rationale** and **Per-candidate explanation** for each surfaced candidate.
223
+
224
+ **Act on verdict:**
225
+
226
+ - **HANG_OFF: P<NNN>**: **halt** capture-problem. Emit a structured halt directive to the calling orchestrator agent naming (a) the parent ticket file path, (b) the new scope to amend in, (c) the `Where to absorb` directive from the subagent verdict. The orchestrator agent owns the parent-ticket edit + commit per the standard ticket-edit flow (do NOT amend the parent ticket from inside capture-problem — capture-problem creates new tickets; ticket-body amendments are manage-problem's surface). Record the hang-off decision + rationale in stderr for the audit trail.
227
+
228
+ - **PROCEED_NEW**: continue to the marker step below. Capture the subagent's rationale + per-candidate explanation in a transient note (stderr) and append it to the new ticket's `## Related` section so the next reviewer sees what was considered. This is the audit-trail contract per ADR-026 grounding + JTBD-201 audit-trail completeness.
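
The verdict handling above could be driven by a small parser along these lines. This is a sketch: it assumes the verdict token starts a line of the subagent's reply, which the source does not specify, and the function name, the `halt`/`continue` labels, and the `unrecognised` fallback are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical parser for the subagent reply. Assumes the verdict token
# (`HANG_OFF: P<NNN>` or `PROCEED_NEW`) appears at the start of a line.
parse_hang_off_verdict() {
  local reply="$1" line
  line=$(printf '%s\n' "$reply" | grep -m1 -E '^(HANG_OFF: P[0-9]+|PROCEED_NEW)')
  case "$line" in
    "HANG_OFF: P"*)
      # Halt capture and name the parent ticket ID.
      printf 'halt %s\n' "$(printf '%s\n' "$line" | grep -oE 'P[0-9]+' | head -n1)"
      ;;
    PROCEED_NEW*)
      echo "continue"
      ;;
    *)
      # The source does not define behaviour for malformed replies.
      echo "unrecognised"
      return 1
      ;;
  esac
}

parse_hang_off_verdict "$(printf 'HANG_OFF: P346\nRationale: same scope')"
parse_hang_off_verdict "$(printf 'PROCEED_NEW\nRationale: no overlap')"
```

How a caller should treat an unrecognised reply is left open here; the SKILL text only defines the two verdicts.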
229
+
230
+ **After the grep + (optional) hang-off-check completes**, write the per-session create-gate marker so the `PreToolUse:Write` hook (P119) permits the subsequent Write of the new ticket file under `docs/problems/open/`. Per **P260 / ADR-050 Option C**, write it under EVERY recent candidate session SID (not just one) so a concurrent orchestrator+subprocess race cannot land the marker under the wrong UUID:
157
231
 
158
232
  ```bash
159
233
  wr-itil-mark-create-gate
@@ -312,7 +386,10 @@ The two skills share the `/tmp/manage-problem-grep-${SESSION_ID}` create-gate ma
312
386
  - **P265** — the RISK_BYPASS-trailer allow-list mechanism in `readme-refresh-detect.sh` that P262's `capture-deferred-readme` token registers into.
313
387
  - **P170** (`docs/problems/known-error/170-problem-tickets-strain-as-fixes-decompose-into-multiple-coordinated-changes-need-rfc-framework.md`) — RFC framework driver; Slice 4 B7.T3 / item 8c authored the type-classification prompt at Step 1.5.
314
388
  - **P176** — agent-side I2 (no type-branching) coverage gap on the SKILL.md surface (this file's surface); descendant of P012 master harness ticket. The Step 1.5 I2 invariant guard is enforced by audit-trailed prose here per ADR-052 § Surface 2 escape-hatch contract; behavioural enforcement awaits the master harness.
315
- - **ADR-032** (`docs/decisions/032-governance-skill-invocation-patterns.proposed.md`) — foreground-lightweight-capture variant amendment.
389
+ - **ADR-032** (`docs/decisions/032-governance-skill-invocation-patterns.proposed.md`) — foreground-lightweight-capture variant amendment (P155); 5th invocation pattern amendment (P346 Phase 3, 2026-05-31) codifies the hang-off-check sub-step 2b dispatch as the canonical fresh-context-subagent-as-decision-arbiter shape.
390
+ - **P346** (`docs/problems/.../346-...md`) — backlog-flow-control master ticket; Phase 3 deliverable lands the hang-off-check dispatch at sub-step 2b above.
391
+ - **RFC-013** (`docs/rfcs/RFC-013-...proposed.md`) — traces P346 Phases 1+2+3 per ADR-071 unconditional Problem→RFC trace.
392
+ - **`packages/itil/agents/hang-off-check.md`** — the fresh-context subagent invoked by sub-step 2b; reads only the structured input payload; emits HANG_OFF: P<NNN> or PROCEED_NEW with rationale + signals + absorb directive.
316
393
  - **ADR-038** — progressive-disclosure pattern (SKILL.md + REFERENCE.md split).
317
394
  - **ADR-044** — decision-delegation contract; type classification is **derive-first**: silent-framework per category 4 on unambiguous-signal descriptions (the classifier IS the framework resolving the answer from observable evidence per ADR-026 grounding); taste per category 5 fallback on genuinely-ambiguous descriptions only. `--no-prompt` / `--type=<value>` are policy-authorised silent-proceed shapes per category 4 (caller-side pre-resolution). P185 re-classified Step 1.5's taxonomy position from "cat 5 unconditional ask" to "cat 4 derive-first with cat 5 fallback".
318
395
  - **P185** — `/wr-itil:capture-problem` asks a classification question it can answer itself from the description's observable evidence — inverse-P078 / P132 trap at a SKILL contract surface. The Step 1.5 derive-first refactor (lexical-signal classifier + stderr advisory) ships this fix.
@@ -351,6 +351,28 @@ Before creating, search existing problems for similar issues. The user may not k
351
351
 
352
352
  **Search strategy**: Search problem filenames AND file content. A match on the filename (kebab-case title) or the Description/Symptoms sections counts. Cast a wide net — false positives are cheap (user chooses), but false negatives mean duplicate problems.
353
353
 
354
+ #### Sub-step 2.8 — Hang-off-check via fresh-context subagent (P346 Phase 3 amendment, 2026-05-31; ADR-032 5th invocation pattern)
355
+
356
+ The wide-net grep + AskUserQuestion at sub-steps 1-6 above handles the title/keyword-overlap class. Sub-step 2.8 closes a wider gap: parent tickets where the new problem's scope belongs absorbed as an Investigation Tasks expansion / Phase N section rather than as a sibling. The wrongly-captured P347 sibling of P346 on 2026-05-31 (now closed as duplicate-of-P346) is the canonical regression — the main agent mid-iter pattern-matched the existing capture flow under session-context bias.
357
+
358
+ Sub-step 2.8 adds a **mechanical pre-filter + fresh-context subagent dispatch** that closes this gap without re-introducing the main agent's bias. Mirrors the `/wr-itil:capture-problem` Step 2 sub-step 2b dispatch verbatim.
359
+
360
+ **Mechanical pre-filter** — grep `docs/problems/open/*.md` + `docs/problems/verifying/*.md` BODIES for tokens shared with the new problem's description: any `ADR-NNN` / `RFC-NNN` / `JTBD-NNN` reference, SKILL path (`/wr-<plugin>:<skill>` or `packages/<plugin>/skills/<skill>/`), or file path (`packages/...`, `docs/...`, `.github/...`, `bin/...`, `scripts/...`). Collect candidates that share ≥1 signal; cap at 5; empty or >5 → skip dispatch.
361
+
362
+ **JTBD-301 firewall** — sub-step 2.8 fires on **maintainer-internal new-problem** captures ONLY. The dispatch MUST be skipped when manage-problem is ingesting a plugin-user-reported issue from `.github/ISSUE_TEMPLATE/problem-report.yml` (plugin-user descriptions do not carry the same authorial intent as maintainer-internal captures; a plugin-user describing their friction in maintainer vocabulary could plausibly trigger a wrong-parent HANG_OFF). When ingesting plugin-user reports, triage stays user-judgement per JTBD-301. Mirrors the existing Step 1.5 / Step 4 firewall patterns in `/wr-itil:capture-problem` (see line 116 of `packages/itil/skills/capture-problem/SKILL.md`).
363
+
364
+ **AFK safe-default**: when `--no-prompt` is propagated, the dispatch still fires (the subagent verdict is non-interactive by construction — no `AskUserQuestion`), and ambiguous-multi-parent cases collapse to `PROCEED_NEW` per the subagent's Rule 6 contract. This satisfies JTBD-006's safe-default contract.
365
+
366
+ **Dispatch** — delegate to `wr-itil:hang-off-check` via the Agent tool with the same structured payload shape capture-problem uses (`SURFACE: manage-problem-step-2.8`; `<new-capture>` payload; `<candidates>` payload with `P<NNN> | <title> | <path> | shared-signals: ...` per row). The subagent reads candidate bodies in full and emits:
+
+ - `HANG_OFF: P<NNN>` with **Rationale**, **Signals matched**, **Where to absorb** → halt manage-problem's new-problem creation; route to the parent-ticket update flow (Step 6 ticket-body edit on the named parent). Record the hang-off decision + rationale in the parent ticket's Investigation Tasks bullet or `### Phase N — <name>` section per the subagent's `Where to absorb` directive. Single-commit grain preserved (the parent-ticket amendment commit IS this manage-problem invocation's commit per ADR-014).
+
+ - `PROCEED_NEW` with **Rationale** + **Per-candidate explanation** → continue to Step 3 (ID assignment). Append the subagent's rationale + per-candidate explanation to the new ticket's `## Related` section as the audit trail per ADR-026 grounding + JTBD-201.
+
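The dispatch row format and the two verdict branches above might be handled along these lines. This is a hedged sketch, not the skill's implementation: it assumes the verdict token sits on its own line of the subagent reply and that shared signals are comma-separated, neither of which the diff specifies. The names `candidate_row`, `parse_verdict`, and `next_step` are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Verdict:
    kind: str               # "HANG_OFF" or "PROCEED_NEW"
    parent: Optional[str]   # e.g. "P346" for HANG_OFF, otherwise None


def candidate_row(ticket_id: str, title: str, path: str, shared: set[str]) -> str:
    """Build one `<candidates>` row; the comma separator is an assumption."""
    return f"{ticket_id} | {title} | {path} | shared-signals: {', '.join(sorted(shared))}"


def parse_verdict(reply: str) -> Verdict:
    """Find the first verdict line in the subagent reply (assumed layout)."""
    for raw in reply.splitlines():
        line = raw.strip().strip("*`")  # tolerate bold or code formatting
        match = re.fullmatch(r"HANG_OFF:\s*(P\d+)", line)
        if match:
            return Verdict("HANG_OFF", match.group(1))
        if line == "PROCEED_NEW":
            return Verdict("PROCEED_NEW", None)
    raise ValueError("no verdict line found in subagent reply")


def next_step(verdict: Verdict) -> str:
    """Map a verdict to the manage-problem step named in the text above."""
    if verdict.kind == "HANG_OFF":
        return f"Step 6: edit parent ticket {verdict.parent}"
    return "Step 3: assign new ID"
```

Raising on an unparseable reply leaves the fallback policy to the caller, since the diff does not say what manage-problem does in that case.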
+ **Why a subagent (not in-SKILL checks)**: the main agent is biased by session context — mid-iter, mid-work, pattern-matching existing flows ("I captured X then dispatched iter; do the same shape for Y"). A fresh subagent invocation starts clean, reads only the structured inputs, and reasons about candidate absorption without the bias. Same architectural pattern as `wr-architect:agent` / `wr-jtbd:agent` / `tdd:review-test` / `wr-risk-scorer:pipeline` — codified as ADR-032's 5th invocation pattern under the P346 amendment 2026-05-31.
+
+ **Cross-references**: `packages/itil/agents/hang-off-check.md` (the subagent); `packages/itil/agents/test/fixtures/regression-p347-vs-p346.md` (canonical behavioural fixture); `docs/decisions/032-governance-skill-invocation-patterns.proposed.md` § Foreground fresh-context-subagent-as-decision-arbiter variant; `docs/rfcs/RFC-013-p346-backlog-flow-control-multi-phase.proposed.md` (multi-phase trace per ADR-071); `docs/problems/.../346-...md` (driver master ticket).
+
  **Hook contract (P119)**: writing a `.open.md` (or any `.<status>.md`) file under `docs/problems/` without first running this Step 2 grep + marker-touch is blocked by the `manage-problem-enforce-create.sh` PreToolUse hook with a `permissionDecision: deny` directing the agent back to this skill. Agents that try to bypass the skill (e.g. mid-retrospective inline capture, post-mortem wrap-up, or any "I'll just write it directly" shortcut) will hit the deny and be redirected here. Do not work around the deny by setting the marker manually — the marker exists to record that this Step 2 ran, and a marker without a grep is the audit-trail gap P119 closes.
 
  ### 3. For new problems: Assign the next ID
@@ -906,6 +928,8 @@ Commit the completed work per ADR-014 (governance skills commit their own work):
  - Fix implemented: `fix(<scope>): <description> (closes P<NNN>)` — include problem file changes (rename to `.verifying.md` + `## Fix Released` section) in the same commit per ADR-022
  4. If risk is above appetite: use `AskUserQuestion` to ask whether to commit anyway, remediate first, or park the work. If `AskUserQuestion` is unavailable, skip the commit and report the uncommitted state clearly (ADR-013 Rule 6 fail-safe). This applies only to the risk-above-appetite branch, not to the delegation-unavailable case above.
 
+ **Multi-commit slice changeset discipline (P141 Phase 2)**: when a single logical fix lands across multiple ADR-014-grain commits targeting the same plugin (e.g. helper extraction in commit 1, callers wired in commit 2, SKILL note + transition in commit 3 — all `packages/<plugin>/`), author ONE changeset on the first commit in the slice. Subsequent same-plugin commits do NOT need their own changeset — the `itil-changeset-discipline.sh` hook's Check 2b recognises any `.changeset/*.md` (or held-window `docs/changesets-holding/*.md` entry) already in the unpushed slice scope (`origin/<base>..HEAD` + untracked + modified-not-staged) that targets `"@windyroad/<plugin>": <any-bump>` and allows. This eliminates the per-commit changeset ceremony that previously produced N redundant `.changeset/*.md` files for one logical release entry (changesets-action collapses bump-class at version-package time, so per-commit changesets rendered N near-identical CHANGELOG bullets for one release). Once a changeset hits `origin/<base>` (drained at release time), it no longer counts — a fresh changeset is required for the next slice. Cross-plugin coverage is NOT permitted: an `@windyroad/itil` changeset does not satisfy a `packages/voice-tone/` commit.
+
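The same-plugin coverage rule above reduces to a small predicate. The sketch below is not the `itil-changeset-discipline.sh` hook itself (which is a shell script whose internals are not shown in this diff). It assumes the standard changesets front-matter layout and takes the changeset texts already gathered from the unpushed slice as input. The function names are hypothetical.

```python
import re


def covers_plugin(changeset_text: str, plugin: str) -> bool:
    """True when the changeset front matter targets "@windyroad/<plugin>"."""
    parts = changeset_text.split("---")
    if len(parts) < 3:
        return False  # no front-matter block to inspect
    front_matter = parts[1]
    pattern = re.compile(
        rf'^\s*"@windyroad/{re.escape(plugin)}"\s*:\s*(?:patch|minor|major)\s*$',
        re.MULTILINE,
    )
    return bool(pattern.search(front_matter))


def commit_needs_changeset(plugin: str, slice_changesets: list[str]) -> bool:
    """A commit needs its own changeset only if nothing in the slice covers its plugin."""
    return not any(covers_plugin(text, plugin) for text in slice_changesets)
```

With one `@windyroad/itil` changeset in the slice, later `packages/itil/` commits pass without a new changeset, while a `packages/voice-tone/` commit still needs its own, matching the cross-plugin rule.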
  ### 12. Auto-release when changesets are queued (ADR-020)
 
  **Skip this step if the skill is running inside an AFK orchestrator** (e.g. `/wr-itil:work-problems`). Orchestrators handle release cadence themselves per ADR-018 (Step 6.5). Detect via the presence of an orchestrator marker in the invoking prompt — look for phrases like "AFK", "work-problems", "batch-work", or the sentinel `ALL_DONE` convention. When in doubt, defer to the orchestrator by skipping this step.
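The marker detection described above could look like the following. It is a minimal sketch assuming plain case-sensitive substring matching, which the text does not specify. The function name is hypothetical.

```python
# Marker phrases quoted from the step description above.
ORCHESTRATOR_MARKERS = ("AFK", "work-problems", "batch-work", "ALL_DONE")


def should_skip_auto_release(invoking_prompt: str) -> bool:
    """Skip Step 12 when the invoking prompt carries any orchestrator marker."""
    return any(marker in invoking_prompt for marker in ORCHESTRATOR_MARKERS)
```

Substring matching errs toward skipping, which is in line with the "when in doubt, defer to the orchestrator" guidance.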