npm - @windyroad/itil - Versions diffs - 0.17.2 → 0.18.0 - Mend

@windyroad/itil 0.17.2 → 0.18.0

Files changed (6)

package/.claude-plugin/plugin.json +1 -1
package/package.json +1 -1
package/skills/manage-incident/SKILL.md +14 -2
package/skills/manage-problem/SKILL.md +21 -2
package/skills/work-problems/SKILL.md +47 -6
package/skills/work-problems/test/work-problems-above-appetite-remediation.bats +122 -0

package/.claude-plugin/plugin.json CHANGED

@@ -1,5 +1,5 @@
 {
   "name": "wr-itil",
-  "version": "0.17.2",
+  "version": "0.18.0",
   "description": "ITIL-aligned IT service management for Claude Code"
 }

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@windyroad/itil",
-  "version": "0.17.2",
+  "version": "0.18.0",
   "description": "ITIL-aligned IT service management for Claude Code (problem, and future incident/change skills)",
   "bin": {
     "windyroad-itil": "./bin/install.mjs"

package/skills/manage-incident/SKILL.md CHANGED

@@ -295,8 +295,20 @@ Otherwise, after the commit in step 14 lands, drain the release queue so the fix
 **Failure handling**: If `release:watch` fails (CI failure, publish failure), stop and report the failure clearly. Do not retry non-interactively — the user must intervene.
-**Above-appetite branch**: If push/release risk is above appetite, skip the drain and report: "Release skipped — risk above appetite. Run `npm run push:watch` and `npm run release:watch` manually when ready."
+**Above-appetite branch (per ADR-041)**: If push or release risk is above appetite (≥ 5/25), the skill MUST auto-apply scorer remediations in rank order until residual risk converges within appetite, OR halt the skill per ADR-041 Rule 5 if the scorer cannot produce a convergent plan. **The skill MUST NOT release above appetite under any circumstance.** The skill MUST NOT call `AskUserQuestion` as a shortcut out of the auto-apply loop.
-`push:watch` and `release:watch` are policy-authorised actions when residual risk is within appetite per RISK-POLICY.md, so no `AskUserQuestion` is required for the drain itself (ADR-013 Rule 6).
+**Auto-apply mechanism (ADR-041 Rule 2):**
+1. Parse the scorer's `RISK_REMEDIATIONS:` block.
+2. Rank by largest absolute `risk_delta` → smaller effort (S < M < L) → lower remediation ID.
+3. Classify each remediation's `description` against ADR-041 Rule 2a's closed action-class enumeration. **Today's orchestrator-supported class (ADR-041 v1)**: `move-to-holding` only. Other classes (`revert-commit`, `amend-commit`, `feature-flag`, `rollback-to-tag`) are deferred to P108 and route to Rule 5 halt.
+4. **Verification Pending carve-out (ADR-041 Rule 2b)**: skip remediations that target a commit attached to a `.verifying.md` ticket.
+5. Apply the top-ranked eligible remediation. Each auto-apply is its own commit (ADR-041 Rule 3 — non-AFK has no iteration wrapper to amend into); each commit goes through architect + JTBD + risk-scorer gates per ADR-014.
+6. Re-score via the same delegation path as step 1 above.
+7. **Loop**: within appetite → drain per the Drain action above. Still above → next remediation. Exhausted or unsupported class → Rule 5 halt.
+**Rule 5 halt (non-AFK mode)**: halt the skill. Emit the terminal report naming the final `RISK_SCORES:`, the Auto-apply trail, any Verification Pending ticket IDs implicated, and a one-line scorer-gap note. The user resolves interactively.
+`push:watch` and `release:watch` are policy-authorised actions when residual risk is within appetite per RISK-POLICY.md, so no `AskUserQuestion` is required for the drain itself (ADR-013 Rule 5). Auto-apply actions under Rules 2–7 are also policy-authorised per ADR-013 Rule 5.
 $ARGUMENTS
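
The ranking rule in step 2 of the auto-apply mechanism (largest absolute `risk_delta`, then smaller effort, then lower remediation ID) can be sketched in shell. This is a minimal sketch under stated assumptions, not the package's implementation: `rank_remediations` is a hypothetical helper, and it assumes each remediation line uses the pipe-delimited shape `- R<id> | <description> | <S|M|L> | <-N> | <files>` with bare values in the effort and delta fields.

```shell
# Hypothetical helper (not part of @windyroad/itil): reads a
# RISK_REMEDIATIONS block on stdin and prints the remediation lines in
# apply order. Assumed line shape:
#   - R<id> | <description> | <S|M|L> | <-N> | <files affected>
rank_remediations() {
  awk -F'|' '
    /^- R[0-9]+ *[|]/ {
      id = $1; gsub(/[^0-9]/, "", id)          # "- R12 " -> "12"
      effort = $3; gsub(/[ \t]/, "", effort)   # " M " -> "M"
      delta = $4 + 0                           # " -3 " -> -3
      if (delta < 0) delta = -delta            # rank on absolute value
      if (effort == "S") rank = 1
      else if (effort == "M") rank = 2
      else rank = 3
      # Prefix three sort keys, then the untouched original line.
      printf "%d\t%d\t%d\t%s\n", delta, rank, id, $0
    }' |
    sort -t "$(printf '\t')" -k1,1nr -k2,2n -k3,3n |
    cut -f4-
}
```

Piping the scorer's block into `rank_remediations` prints the lines in apply order, so the first output line is the candidate for step 5. The `RISK_REMEDIATIONS:` header line is skipped because it does not match the remediation pattern.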

package/skills/manage-problem/SKILL.md CHANGED

@@ -724,8 +724,27 @@ Otherwise, after the commit in step 11 lands, drain the release queue so the fix
 **Failure handling**: If `release:watch` fails (CI failure, publish failure), stop and report the failure clearly. Do not retry non-interactively — the user must intervene.
-**Above-appetite branch**: If push/release risk is above appetite, skip the drain and report: "Release skipped — risk above appetite. Run `npm run push:watch` and `npm run release:watch` manually when ready."
+**Above-appetite branch (per ADR-041)**: If push or release risk is above appetite (≥ 5/25), the skill MUST auto-apply scorer remediations in rank order until residual risk converges within appetite, OR halt the skill per ADR-041 Rule 5 if the scorer cannot produce a convergent plan. **The skill MUST NOT release above appetite under any circumstance.** The skill MUST NOT call `AskUserQuestion` as a shortcut out of the auto-apply loop.
-`push:watch` and `release:watch` are policy-authorised actions when residual risk is within appetite per RISK-POLICY.md, so no `AskUserQuestion` is required for the drain itself (ADR-013 Rule 6).
+**Auto-apply mechanism (ADR-041 Rule 2):**
+1. Parse the scorer's `RISK_REMEDIATIONS:` block.
+2. Rank by largest absolute `risk_delta` → smaller effort (S < M < L) → lower remediation ID.
+3. Classify each remediation's `description` against ADR-041 Rule 2a's closed action-class enumeration. **Today's orchestrator-supported class (ADR-041 v1)**: `move-to-holding` only. Other classes (`revert-commit`, `amend-commit`, `feature-flag`, `rollback-to-tag`) are deferred to P108 and route to Rule 5 halt.
+4. **Verification Pending carve-out (ADR-041 Rule 2b)**: skip remediations that target a commit attached to a `.verifying.md` ticket. Do NOT auto-revert VP commits.
+5. Apply the top-ranked eligible remediation:
+   - `move-to-holding`: `git mv .changeset/<name>.md docs/changesets-holding/<name>.md` + append to holding-area README "Currently held" per ADR-041 Rule 6. Since the non-AFK skill has no iteration wrapper to amend into, each auto-apply is its own commit (ADR-041 Rule 3). Each commit goes through the standard ADR-014 commit flow — architect + JTBD + risk-scorer gates.
+6. Re-score via the same delegation path as step 1 above.
+7. **Loop**: re-score within appetite → drain per the Drain action above. Re-score still above → goto step 3 with remaining remediations. Exhausted or unsupported class → Rule 5 halt.
+**Rule 5 halt (non-AFK mode)**: halt the skill. Emit the terminal report naming:
+- The final `RISK_SCORES:` line
+- An "Auto-apply trail" subsection listing each remediation attempted with outcome
+- Any Verification Pending ticket IDs implicated per Rule 2b
+- A one-line scorer-gap note (e.g., "scorer produced only `move-to-holding`; residual still ≥ 5/25 after exhaustion — extend scorer vocabulary per P108")
+The user resolves interactively — typical resolutions include splitting the commit, feature-flagging the change, or opening a problem ticket documenting the scorer gap.
+`push:watch` and `release:watch` are policy-authorised actions when residual risk is within appetite per RISK-POLICY.md, so no `AskUserQuestion` is required for the drain itself (ADR-013 Rule 5). Auto-apply actions under Rules 2–7 are also policy-authorised per ADR-013 Rule 5 — `RISK-POLICY.md` appetite + ADR-041 eligibility constitute the policy.
 $ARGUMENTS
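
Step 3 of the auto-apply mechanism classifies each remediation's `description` against a closed set of action classes, of which only `move-to-holding` is supported today. The sketch below is a hedged illustration of that check: `classify_action_class` is a hypothetical helper, and its two match patterns follow the wording in the work-problems SKILL.md (a description that mentions the holding area or cites `docs/changesets-holding/`). The text of ADR-041 Rule 2a itself is not part of this diff, so the exact matching rules are an assumption.

```shell
# Hypothetical helper (not part of @windyroad/itil): map a remediation
# description to an action class. Only move-to-holding is recognised;
# everything else is reported as unsupported so the caller can route to
# the Rule 5 halt once no eligible remediation remains.
classify_action_class() {
  case "$1" in
    *docs/changesets-holding/* | *"holding area"*)
      echo move-to-holding ;;   # the one class supported in ADR-041 v1
    *)
      echo unsupported ;;       # revert-commit, feature-flag, etc. (deferred to P108)
  esac
}
```

For example, `classify_action_class 'Revert commit abc123'` prints `unsupported`, which is the signal to skip that remediation and continue with the next ranked one.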

package/skills/work-problems/SKILL.md CHANGED

@@ -238,17 +238,20 @@ Format as a brief status line, not a wall of text. The user will read these when
 [Iteration 3] Skipped P016 (Multi-concern ticket splitting) — fix released, awaiting user verification. Worked P024 (Risk scorer WIP flag) — implemented fix, closed. 6 problems remain. ($1.12, 62s, 541K tokens)
 ```
-### Step 6.5: Release-cadence check (per ADR-018)
+### Step 6.5: Release-cadence check (per ADR-018, above-appetite branch per ADR-041)
-After the iteration's commit lands but before starting the next iteration, check whether the unreleased queue would push pipeline risk to or above appetite. If so, drain the queue before continuing. This prevents silent accumulation of unreleased changesets across AFK iterations (P041).
+After the iteration's commit lands but before starting the next iteration, check whether the unreleased queue would push pipeline risk to or above appetite. This prevents silent accumulation of unreleased changesets across AFK iterations (P041). **The orchestrator MUST NOT release above appetite under any circumstance** — above-appetite states route to the ADR-041 auto-apply loop or halt.
 **Mechanism — delegate, do not re-implement scoring:**
 1. Invoke the risk scorer to score cumulative pipeline state. Two paths are valid (per ADR-015):
    - **Primary**: delegate to subagent type `wr-risk-scorer:pipeline` via the Agent tool.
    - **Fallback**: if that subagent type is not available, invoke skill `/wr-risk-scorer:assess-release` via the Skill tool. The skill wraps the same pipeline subagent.
-2. Read the returned `RISK_SCORES: commit=X push=Y release=Z` line.
-3. **Threshold**: if `push` or `release` is at or above appetite (4/25, "Low" band per `RISK-POLICY.md`), drain the queue.
+2. Read the returned `RISK_SCORES: commit=X push=Y release=Z` line and the `RISK_REMEDIATIONS:` block (if present).
+3. **Classify the residual**:
+   - **Within appetite (≤ 3/25)** — no drain needed. Proceed to Step 6.75.
+   - **At appetite (= 4/25)** — drain the queue per the Drain action below, then proceed to Step 6.75.
+   - **Above appetite (≥ 5/25)** — route to the **Above-appetite branch** below. Do NOT drain. Do NOT proceed to Step 6.75 until either (a) the auto-apply loop re-converges within appetite and drain succeeds, or (b) Rule 5 halt fires.
 **Drain action (non-interactive, policy-authorised per ADR-013 Rule 6):**
@@ -258,7 +261,44 @@ After the iteration's commit lands but before starting the next iteration, check
 **Failure handling**: If `release:watch` fails (CI failure, publish failure), stop the loop and report the failure in the AFK summary. Do not retry non-interactively — the user must intervene.
-`push:watch` and `release:watch` are policy-authorised actions when residual risk is within appetite per RISK-POLICY.md, so no `AskUserQuestion` is required for the drain itself (ADR-013 Rule 6).
+`push:watch` and `release:watch` are policy-authorised actions when residual risk is within appetite per RISK-POLICY.md, so no `AskUserQuestion` is required for the drain itself (ADR-013 Rule 5).
+#### Above-appetite branch (per ADR-041)
+**Invariant**: the orchestrator MUST NOT release above appetite. There is no code path in Step 6.5 that releases at residual push/release ≥ 5/25. The orchestrator MUST NOT call `AskUserQuestion` as a shortcut out of the auto-apply loop — the scorer is the decision surface, not the user. The branch terminates in either a within-appetite drain or a Rule 5 halt.
+**Auto-apply loop (ADR-041 Rule 2):**
+1. Parse the scorer's `RISK_REMEDIATIONS:` block. Expected shape per ADR-015:
+   ```
+   RISK_REMEDIATIONS:
+   - R1 | <description> | <effort S/M/L> | <risk_delta -N> | <files affected>
+   - R2 | ...
+   ```
+2. Rank remediations by: largest absolute `risk_delta` first; tie-break by smaller effort (S < M < L); tie-break further by lower remediation ID (R1 before R2).
+3. Classify each remediation's `description` against ADR-041 Rule 2a's closed action-class enumeration. **Today's orchestrator-supported class (ADR-041 v1)**: `move-to-holding` (matched when `description` says move a changeset file to the holding area, or explicitly cites `docs/changesets-holding/`). All other classes (`revert-commit`, `amend-commit`, `feature-flag`, `rollback-to-tag`) are deferred to P108 and route to Rule 5 halt.
+4. **Verification Pending carve-out (ADR-041 Rule 2b)**: if a remediation targets a commit attached to a `.verifying.md` ticket, skip it and continue ranking. Do NOT auto-revert VP commits. If VP carve-out leaves no eligible remediations, route to Rule 5 halt naming the VP ticket(s).
+5. Apply the top-ranked eligible remediation:
+   - `move-to-holding`: `git mv .changeset/<name>.md docs/changesets-holding/<name>.md`. Append the entry to `docs/changesets-holding/README.md` under "Currently held" per ADR-041 Rule 6. Amend the iteration's commit to fold the move (per ADR-041 Rule 3 amend-based folding — preserves ADR-032 one-commit-per-iteration invariant).
+6. Re-invoke the risk scorer (same delegation path as step 1 above — subagent preferred, skill fallback). Read the new `RISK_SCORES:` line.
+7. **Loop classification**:
+   - **Re-score within appetite (≤ 4/25)** — proceed to Drain action above. Done with the above-appetite branch.
+   - **Re-score still above appetite (≥ 5/25)** — goto step 3 with the remaining ranked remediations.
+   - **No remediations remain** or **no remaining remediation classifies into Rule 2a enumeration** — Rule 5 halt.
+**Governance gates per auto-apply (ADR-041 Rule 3):** each auto-apply that requires a commit (the amend in step 5 above) goes through the standard ADR-014 commit flow — architect review, JTBD review, risk-scorer gate. A gate rejection falls through to Rule 5 halt. The scorer's ranking does NOT bypass gates.
+**Rule 5 halt (exhaustion):** when the auto-apply loop exhausts without convergence, or any gate/operation fails, halt the loop. Do NOT proceed to Step 6.75. Do NOT spawn the next iteration. Emit the iteration summary with:
+- `outcome: halted-above-appetite`
+- The final `RISK_SCORES:` line
+- An "Auto-apply trail" subsection listing each remediation attempted with outcome
+- Any Verification Pending ticket IDs implicated per Rule 2b
+- A one-line scorer-gap note (e.g., "scorer produced only `move-to-holding` remediations; residual still ≥ 5/25 after exhaustion — extend scorer vocabulary per P108")
+Halt is a **bug signal** — the scorer should always have progressively more aggressive remediations available once P108 lands. Until then, exhaustion is expected when the only path to within-appetite requires a non-`move-to-holding` class.
+**Audit trail (ADR-041 Rule 6):** append one line per auto-apply to the iteration summary's Auto-apply trail subsection, including remediation ID, action class, pre/post scores, action taken, and description citation. For `move-to-holding` actions, also append to `docs/changesets-holding/README.md` "Currently held".
 ### Step 6.75: Inter-iteration verification (P036)
@@ -296,7 +336,8 @@ When `AskUserQuestion` is unavailable or the user is AFK, the skill (and the del
 | Scope expansion during work | Update problem file, re-score WSJF, move to next problem instead of continuing |
 | Commit when risk within appetite | Auto-commit (manage-problem step 9e fallback) |
 | Commit when risk above appetite | Skip commit, report uncommitted state |
-| Pipeline risk at appetite (push or release >= 4/25) | Drain release queue (`push:watch` then `release:watch`) before next iteration — per ADR-018 (Step 6.5) |
+| Pipeline risk at appetite (push or release = 4/25) | Drain release queue (`push:watch` then `release:watch`) before next iteration — per ADR-018 (Step 6.5) |
+| Pipeline risk above appetite (push or release >= 5/25) | Auto-apply scorer remediations in rank order (ADR-041 Rule 2) under the closed action-class enumeration (Rule 2a). Today: `move-to-holding` supported; other classes deferred to P108. Re-score after each apply; drain when within appetite. **Never release above appetite** (ADR-041 Rule 1) — no AskUserQuestion shortcut. Halt the loop with `outcome: halted-above-appetite` if the loop exhausts without convergence (ADR-041 Rule 5). Verification Pending commits excluded from auto-revert (Rule 2b). Per ADR-041 (Step 6.5 Above-appetite branch). |
 | Origin diverged before start | Pull `--ff-only` if trivial; stop with report (`git log HEAD..origin/<base>` and reverse) if non-fast-forward — per ADR-019 (Step 0) |
 | Fix verification needed | Skip problem, add to "needs verification" list |
 | Stop-condition #2 with user-answerable skip-reasons | Emit Outstanding Design Questions table in summary (do NOT call AskUserQuestion). The persona is AFK by definition — per JTBD-006 and ADR-013 Rule 6 — so the table is the default. Interactive invocations may batch up to 4 questions through AskUserQuestion instead — per ADR-013 Rule 1 (Step 2.5). |
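
The three-way classification in Step 6.5 and the loop that follows it can be sketched together. This is an illustrative sketch, not the orchestrator's code: `parse_score`, `classify_residual`, and `auto_apply_loop` are hypothetical names, scores are assumed to be plain integers in the `RISK_SCORES: commit=X push=Y release=Z` line, and `apply_remediation` and `rescore` are caller-supplied stand-ins for the real apply step and scorer delegation. The loop input is assumed to be the already ranked and eligible remediations (steps 2 to 4).

```shell
# parse_score KEY LINE: extract one integer score from a RISK_SCORES line.
parse_score() {
  printf '%s\n' "$2" | sed -n "s/.*[[:space:]]$1=\([0-9][0-9]*\).*/\1/p"
}

# classify_residual LINE: print within (<= 3), at (= 4), or above (>= 5),
# based on the worse of the push and release scores.
classify_residual() {
  push_score=$(parse_score push "$1")
  release_score=$(parse_score release "$1")
  worst=$push_score
  if [ "$release_score" -gt "$worst" ]; then worst=$release_score; fi
  if [ "$worst" -ge 5 ]; then echo above
  elif [ "$worst" -eq 4 ]; then echo at
  else echo within
  fi
}

# auto_apply_loop: reads ranked, eligible remediation lines on stdin.
# Assumed caller-supplied functions (hypothetical):
#   apply_remediation LINE   applies one remediation; non-zero on failure
#   rescore                  prints a fresh RISK_SCORES line
auto_apply_loop() {
  while IFS= read -r remediation; do
    apply_remediation "$remediation" || break   # failed operation: halt
    if [ "$(classify_residual "$(rescore)")" != above ]; then
      echo drain                                # residual <= 4/25
      return 0
    fi
  done
  echo halted-above-appetite                    # exhausted without convergence
  return 1
}
```

The loop has only two exits, `drain` and `halted-above-appetite`, which mirrors the invariant that no path releases at a residual of 5/25 or more. Treating both `within` and `at` as drain-eligible after a re-score follows step 7 of the loop, which uses ≤ 4/25 as its convergence test.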

package/skills/work-problems/test/work-problems-above-appetite-remediation.bats ADDED

@@ -0,0 +1,122 @@
+#!/usr/bin/env bats
+# Doc-lint guard: work-problems SKILL.md must include the above-appetite
+# auto-apply + halt-on-exhaustion branch per ADR-041.
+#
+# Structural assertion — Permitted Exception to the source-grep ban (ADR-005 / P011).
+# These assertions are load-bearing-string checks on the skill specification
+# document. Per P081, structural tests are placeholders for behavioural tests
+# against P012's skill-testing harness; until that harness lands, these
+# assertions are the confirmation mechanism called out in ADR-041 Confirmation
+# criterion 2.
+#
+# Cross-reference:
+#   P103 (work-problems escalates resolved release decisions — defeats AFK)
+#   P104 (partial-progress paints release queue into corner)
+#   P108 (scorer remediation action-class vocabulary — deferred work)
+#   ADR-041 (auto-apply scorer remediations — never release above appetite)
+#   ADR-037 (skill testing strategy — contract-assertion pattern)
+#   @jtbd JTBD-006 (Progress the Backlog While I'm Away)
+setup() {
+  SKILL_DIR="$(cd "$(dirname "$BATS_TEST_FILENAME")/.." && pwd)"
+  SKILL_FILE="${SKILL_DIR}/SKILL.md"
+}
+@test "SKILL.md exists" {
+  [ -f "$SKILL_FILE" ]
+}
+@test "SKILL.md cites ADR-041 (above-appetite auto-apply)" {
+  # ADR-041 Confirmation criterion 1: source review names the ADR.
+  run grep -n "ADR-041" "$SKILL_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "SKILL.md contains the never-release-above-appetite invariant (Rule 1)" {
+  # The load-bearing invariant from Rule 1. "MUST NOT release above appetite"
+  # is the phrase that anchors the policy.
+  run grep -nE "MUST NOT release above appetite" "$SKILL_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "SKILL.md references RISK_REMEDIATIONS parsing contract (Rule 2)" {
+  # Rule 2 parses RISK_REMEDIATIONS from the scorer. If the string is absent,
+  # the skill does not implement the parse step.
+  run grep -n "RISK_REMEDIATIONS" "$SKILL_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "SKILL.md references docs/changesets-holding/ (Rule 2a move-to-holding class)" {
+  # The one currently-implemented action class moves changesets to the holding
+  # area. The path must be named so the skill body is unambiguous about target.
+  run grep -n "docs/changesets-holding/" "$SKILL_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "SKILL.md names the closed action-class enumeration (Rule 2a)" {
+  # "move-to-holding" is the single supported class today; later P108 extends.
+  # The string must appear so the enumeration is greppable.
+  run grep -n "move-to-holding" "$SKILL_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "SKILL.md names P108 (deferred action-class vocabulary)" {
+  # Rule 2a defers revert-commit, amend-commit, feature-flag, rollback-to-tag
+  # to P108. Keeping the reference greppable makes the deferral auditable.
+  run grep -n "P108" "$SKILL_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "SKILL.md includes the Verification Pending carve-out (Rule 2b)" {
+  # Rule 2b prevents auto-revert of commits attached to .verifying.md tickets.
+  run grep -niE "Verification Pending.*carve.out|Rule 2b|\.verifying\.md.*(skip|exclude|carve)" "$SKILL_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "SKILL.md references the halt-on-exhaustion outcome (Rule 5)" {
+  # Rule 5 emits outcome: halted-above-appetite when the auto-apply loop
+  # exhausts without convergence.
+  run grep -n "halted-above-appetite" "$SKILL_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "SKILL.md cites ADR-013 Rule 5 (policy-authorised silent proceed)" {
+  # Rule 1 is authorised by ADR-013 Rule 5. The citation should be explicit.
+  run grep -nE "ADR-013 Rule 5" "$SKILL_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "SKILL.md references the scorer-gap halt signal" {
+  # Rule 5 treats exhaustion as a scorer-gap bug signal, not routine behaviour.
+  run grep -niE "scorer.gap|scorer vocabulary|bug signal" "$SKILL_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "SKILL.md Non-Interactive Decision Making table covers above-appetite auto-apply" {
+  # The non-interactive defaults table row makes the behaviour discoverable to
+  # an AFK reader without forcing a full prose read.
+  run grep -niE "above appetite.*>= 5/25|pipeline risk above appetite|auto-apply scorer remediations" "$SKILL_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "SKILL.md forbids AskUserQuestion shortcut for above-appetite" {
+  # The anti-shortcut stance is load-bearing for P103. Absent this, the skill
+  # reverts to the P103 bug. Allow optional "call "/"invoke " verb and optional
+  # backtick around the tool name (since the SKILL.md phrasing treats it as code).
+  run grep -niE "MUST NOT (call |invoke )?[\`]?AskUserQuestion" "$SKILL_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "SKILL.md references the amend-based folding rule for ADR-032 compatibility (Rule 3)" {
+  # Auto-apply commits fold into the iteration's main commit via amend so
+  # ADR-032's one-commit-per-iteration invariant holds.
+  run grep -niE "amend|git commit --amend" "$SKILL_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "SKILL.md references the audit-trail subsection (Rule 6)" {
+  # Rule 6 emits an Auto-apply trail subsection in the iteration summary. If
+  # the phrase is missing, audit trail is not wired through.
+  run grep -niE "Auto-apply trail|audit trail" "$SKILL_FILE"
+  [ "$status" -eq 0 ]
+}