@windyroad/itil 0.18.0-preview.185 → 0.18.1-preview.187
- package/.claude-plugin/plugin.json +1 -1
- package/package.json +1 -1
- package/skills/manage-incident/SKILL.md +7 -8
- package/skills/manage-problem/SKILL.md +16 -11
- package/skills/report-upstream/SKILL.md +1 -1
- package/skills/work-problems/SKILL.md +17 -17
- package/skills/work-problems/test/work-problems-above-appetite-remediation.bats +23 -7
package/package.json
CHANGED
package/skills/manage-incident/SKILL.md
CHANGED
@@ -295,17 +295,16 @@ Otherwise, after the commit in step 14 lands, drain the release queue so the fix
 
 **Failure handling**: If `release:watch` fails (CI failure, publish failure), stop and report the failure clearly. Do not retry non-interactively — the user must intervene.
 
-**Above-appetite branch (per ADR-
+**Above-appetite branch (per ADR-042)**: If push or release risk is above appetite (≥ 5/25), the skill MUST auto-apply scorer remediations incrementally until residual risk converges within appetite, OR halt the skill per ADR-042 Rule 5 if the scorer cannot produce a convergent plan. **The skill MUST NOT release above appetite under any circumstance.** The skill MUST NOT call `AskUserQuestion` as a shortcut out of the auto-apply loop.
 
-**Auto-apply mechanism (ADR-
+**Auto-apply mechanism (ADR-042 Rule 2):**
 
 1. Parse the scorer's `RISK_REMEDIATIONS:` block.
-2.
-3.
-4.
-5.
-6.
-7. **Loop**: within appetite → drain per the Drain action above. Still above → next remediation. Exhausted or unsupported class → Rule 5 halt.
+2. Read the descriptions. Decide what to do. The agent MAY follow a scorer suggestion, adapt it, or do something else entirely. There is no requirement to rank all suggestions upfront or iterate through them in order.
+3. **Verification Pending carve-out (ADR-042 Rule 2b)**: skip remediations that target a commit attached to a `.verifying.md` ticket.
+4. Apply the chosen action using standard primitives (git, Edit, Bash). Each auto-apply is its own commit (ADR-042 Rule 3 — non-AFK has no iteration wrapper to amend into); each commit goes through architect + JTBD + risk-scorer gates per ADR-014.
+5. Re-score via the same delegation path as step 1 above.
+6. **Loop**: within appetite → drain per the Drain action above. Still above → continue working to reduce risk. The agent reads the new remediations and decides what to do next. Loop. Exhausted → Rule 5 halt.
 
 **Rule 5 halt (non-AFK mode)**: halt the skill. Emit the terminal report naming the final `RISK_SCORES:`, the Auto-apply trail, any Verification Pending ticket IDs implicated, and a one-line scorer-gap note. The user resolves interactively.
 
|
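The Verification Pending carve-out in step 3 of the hunk above could be sketched as a small shell guard. This is an illustration under stated assumptions, not the package's implementation: `is_vp_commit` and `plan_revert` are hypothetical helpers, and the sketch assumes that tickets are files named `*.verifying.md` sitting directly in one directory and that a ticket mentions its fix commit's SHA somewhere in its text. The diff itself does not say how a commit is attached to a ticket.

```shell
# Hypothetical helper: succeed (exit 0) if the SHA in $1 appears in any
# *.verifying.md file directly under the directory in $2.
# Assumption: a ticket names its fix commit in plain text.
is_vp_commit() {
  for ticket in "$2"/*.verifying.md; do
    # An unmatched glob stays literal, so skip anything that is not a file.
    [ -f "$ticket" ] || continue
    if grep -qF -- "$1" "$ticket"; then
      return 0
    fi
  done
  return 1
}

# Hypothetical wrapper: print what an agent could do with a revert suggestion.
plan_revert() {
  if is_vp_commit "$1" "$2"; then
    echo "skip $1 (Verification Pending)"
  else
    echo "git revert --no-edit $1"
  fi
}
```

Printing the command instead of running it keeps the sketch free of side effects; a real skill would run the revert and then pass the resulting commit through its commit gates.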
package/skills/manage-problem/SKILL.md
CHANGED
@@ -724,18 +724,23 @@ Otherwise, after the commit in step 11 lands, drain the release queue so the fix
 
 **Failure handling**: If `release:watch` fails (CI failure, publish failure), stop and report the failure clearly. Do not retry non-interactively — the user must intervene.
 
-**Above-appetite branch (per ADR-
+**Above-appetite branch (per ADR-042)**: If push or release risk is above appetite (≥ 5/25), the skill MUST auto-apply scorer remediations incrementally until residual risk converges within appetite, OR halt the skill per ADR-042 Rule 5 if the scorer cannot produce a convergent plan. **The skill MUST NOT release above appetite under any circumstance.** The skill MUST NOT call `AskUserQuestion` as a shortcut out of the auto-apply loop.
 
-**Auto-apply mechanism (ADR-
+**Auto-apply mechanism (ADR-042 Rule 2):**
 
-1. Parse the scorer's `RISK_REMEDIATIONS:` block.
-
-
-
-
-
-
-
+1. Parse the scorer's `RISK_REMEDIATIONS:` block. Expected shape per ADR-015 / ADR-042 Rule 2a (5 columns):
+```
+RISK_REMEDIATIONS:
+- R1 | <description> | <effort S/M/L> | <risk_delta -N> | <files affected>
+- R2 | ...
+```
+2. Read the descriptions. Decide what to do. The agent MAY follow a scorer suggestion, adapt it, or do something else entirely. There is no requirement to rank all suggestions upfront or iterate through them in order.
+3. **Verification Pending carve-out (ADR-042 Rule 2b)**: skip remediations that target a commit attached to a `.verifying.md` ticket. Do NOT auto-revert VP commits.
+4. Apply the chosen action using standard primitives (git, Edit, Bash). Example actions:
+- `move-to-holding`: `git mv .changeset/<name>.md docs/changesets-holding/<name>.md` + append to holding-area README "Currently held" per ADR-042 Rule 6. Since the non-AFK skill has no iteration wrapper to amend into, each auto-apply is its own commit (ADR-042 Rule 3). Each commit goes through the standard ADR-014 commit flow — architect + JTBD + risk-scorer gates.
+- `revert-commit`: `git revert --no-edit <sha>`. The scorer SHOULD supply the target commit SHA in the `description` column. Before executing, verify the SHA is NOT attached to a `.verifying.md` ticket (Rule 2b carve-out). After revert, commit the revert as a standalone auto-apply commit (no amend folding in non-AFK mode). If `git revert` produces merge conflicts, route to Rule 5 halt with the conflict detail.
+5. Re-score via the same delegation path as step 1 above.
+6. **Loop**: re-score within appetite → drain per the Drain action above. Re-score still above → continue working to reduce risk. The agent reads the new remediations and decides what to do next. Loop. Exhausted or unsupported class → Rule 5 halt.
 
 **Rule 5 halt (non-AFK mode)**: halt the skill. Emit the terminal report naming:
 - The final `RISK_SCORES:` line
|
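The five-column `RISK_REMEDIATIONS:` shape added in step 1 above lends itself to a short parsing sketch. The function below is hypothetical (`parse_remediations` is not part of the package) and rests on two assumptions that the diff does not confirm: rows directly follow the `RISK_REMEDIATIONS:` line until the first line that is not a `- ` row, and no column contains a literal `|`.

```shell
# Hypothetical parser: read scorer output on stdin and print one tab-separated
# line per remediation row: id, description, effort, risk_delta, files.
parse_remediations() {
  awk -F'|' '
    # The marker line opens the block.
    /^RISK_REMEDIATIONS:/ { in_block = 1; next }
    # Inside the block, every "- " line is one remediation row.
    in_block && /^- / {
      sub(/^- /, "")                     # drop the list marker
      for (i = 1; i <= NF; i++)          # trim blanks around each column
        gsub(/^[ \t]+|[ \t]+$/, "", $i)
      print $1 "\t" $2 "\t" $3 "\t" $4 "\t" $5
      next
    }
    # Any other line closes the block.
    { in_block = 0 }
  '
}
```

Piping the scorer's output through `parse_remediations` would give the agent the description column as plain text to read, which fits step 2: the agent decides from the prose and is not bound to a fixed action vocabulary.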
@@ -745,6 +750,6 @@ Otherwise, after the commit in step 11 lands, drain the release queue so the fix
 
 The user resolves interactively — typical resolutions include splitting the commit, feature-flagging the change, or opening a problem ticket documenting the scorer gap.
 
-`push:watch` and `release:watch` are policy-authorised actions when residual risk is within appetite per RISK-POLICY.md, so no `AskUserQuestion` is required for the drain itself (ADR-013 Rule 5). Auto-apply actions under Rules 2–7 are also policy-authorised per ADR-013 Rule 5 — `RISK-POLICY.md` appetite + ADR-
+`push:watch` and `release:watch` are policy-authorised actions when residual risk is within appetite per RISK-POLICY.md, so no `AskUserQuestion` is required for the drain itself (ADR-013 Rule 5). Auto-apply actions under Rules 2–7 are also policy-authorised per ADR-013 Rule 5 — `RISK-POLICY.md` appetite + ADR-042 eligibility constitute the policy.
 
 $ARGUMENTS
package/skills/report-upstream/SKILL.md
CHANGED
@@ -1,7 +1,7 @@
 ---
 name: wr-itil:report-upstream
 description: Report a local problem ticket as a structured issue against an upstream repository, with bidirectional cross-references and SECURITY.md-aware routing for security-classified tickets. Implements the contract in ADR-024, with ADR-033 governing problem-first classifier + default body shape.
-allowed-tools: Read, Write, Edit, Bash, Glob, Grep, AskUserQuestion
+allowed-tools: Read, Write, Edit, Bash, Glob, Grep, AskUserQuestion, Skill, Agent
 ---
 
 # Report Upstream — Cross-Project Problem-Reporting Skill
package/skills/work-problems/SKILL.md
CHANGED
@@ -238,9 +238,9 @@ Format as a brief status line, not a wall of text. The user will read these when
 [Iteration 3] Skipped P016 (Multi-concern ticket splitting) — fix released, awaiting user verification. Worked P024 (Risk scorer WIP flag) — implemented fix, closed. 6 problems remain. ($1.12, 62s, 541K tokens)
 ```
 
-### Step 6.5: Release-cadence check (per ADR-018, above-appetite branch per ADR-
+### Step 6.5: Release-cadence check (per ADR-018, above-appetite branch per ADR-042)
 
-After the iteration's commit lands but before starting the next iteration, check whether the unreleased queue would push pipeline risk to or above appetite. This prevents silent accumulation of unreleased changesets across AFK iterations (P041). **The orchestrator MUST NOT release above appetite under any circumstance** — above-appetite states route to the ADR-
+After the iteration's commit lands but before starting the next iteration, check whether the unreleased queue would push pipeline risk to or above appetite. This prevents silent accumulation of unreleased changesets across AFK iterations (P041). **The orchestrator MUST NOT release above appetite under any circumstance** — above-appetite states route to the ADR-042 auto-apply loop or halt.
 
 **Mechanism — delegate, do not re-implement scoring:**
 
@@ -263,30 +263,30 @@ After the iteration's commit lands but before starting the next iteration, check
 
 `push:watch` and `release:watch` are policy-authorised actions when residual risk is within appetite per RISK-POLICY.md, so no `AskUserQuestion` is required for the drain itself (ADR-013 Rule 5).
 
-#### Above-appetite branch (per ADR-
+#### Above-appetite branch (per ADR-042)
 
 **Invariant**: the orchestrator MUST NOT release above appetite. There is no code path in Step 6.5 that releases at residual push/release ≥ 5/25. The orchestrator MUST NOT call `AskUserQuestion` as a shortcut out of the auto-apply loop — the scorer is the decision surface, not the user. The branch terminates in either a within-appetite drain or a Rule 5 halt.
 
-**Auto-apply loop (ADR-
+**Auto-apply loop (ADR-042 Rule 2):**
 
-1. Parse the scorer's `RISK_REMEDIATIONS:` block. Expected shape per ADR-015:
+1. Parse the scorer's `RISK_REMEDIATIONS:` block. Expected shape per ADR-015 / ADR-042 Rule 2a (5 columns):
 ```
 RISK_REMEDIATIONS:
 - R1 | <description> | <effort S/M/L> | <risk_delta -N> | <files affected>
 - R2 | ...
 ```
-2.
-3.
-4.
-
-- `
-
-
+2. Read the descriptions. Decide what to do. The agent MAY follow a scorer suggestion, adapt it, or do something else entirely. There is no requirement to rank all suggestions upfront or iterate through them in order.
+3. **Verification Pending carve-out (ADR-042 Rule 2b)**: if a remediation targets a commit attached to a `.verifying.md` ticket, do NOT auto-revert it. Skip that suggestion and decide on the next one.
+4. Apply the chosen action using standard primitives (git, Edit, Bash). Example actions the agent might take:
+- `move-to-holding`: `git mv .changeset/<name>.md docs/changesets-holding/<name>.md`. Append the entry to `docs/changesets-holding/README.md` under "Currently held" per ADR-042 Rule 6. Amend the iteration's commit to fold the move (per ADR-042 Rule 3 amend-based folding — preserves ADR-032 one-commit-per-iteration invariant).
+- `revert-commit`: `git revert --no-edit <sha>`. The scorer SHOULD supply the target commit SHA in the `description` column (e.g., "Revert commit 9a1f96c that introduced the risky gate"). Before executing, verify the SHA is NOT attached to a `.verifying.md` ticket (Rule 2b carve-out). After revert, amend the iteration's commit to fold the revert. If `git revert` produces merge conflicts, route to Rule 5 halt with the conflict detail — do not attempt non-interactive conflict resolution.
+5. Re-invoke the risk scorer (same delegation path as step 1 above — subagent preferred, skill fallback). Read the new `RISK_SCORES:` line.
+6. **Loop classification**:
 - **Re-score within appetite (≤ 4/25)** — proceed to Drain action above. Done with the above-appetite branch.
-- **Re-score still above appetite (≥ 5/25)** —
-- **No remediations remain** or **
+- **Re-score still above appetite (≥ 5/25)** — continue working to reduce risk. The agent reads the new remediations and decides what to do next. Loop.
+- **No remediations remain** or **the agent has exhausted its own ideas** — Rule 5 halt.
 
-**Governance gates per auto-apply (ADR-
+**Governance gates per auto-apply (ADR-042 Rule 3):** each auto-apply that requires a commit (the amend in step 4 above) goes through the standard ADR-014 commit flow — architect review, JTBD review, risk-scorer gate. A gate rejection falls through to Rule 5 halt. The scorer's suggestions do NOT bypass gates.
 
 **Rule 5 halt (exhaustion):** when the auto-apply loop exhausts without convergence, or any gate/operation fails, halt the loop. Do NOT proceed to Step 6.75. Do NOT spawn the next iteration. Emit the iteration summary with:
 
|
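The loop classification in step 6 of the hunk above reduces to a three-way decision, sketched below. `next_step` is a hypothetical name, and modelling "remediations remain or the agent still has ideas" as a single count of untried options is an assumption made for illustration. The thresholds come from the text: within appetite is at most 4/25, and either the push or the release score reaching 5/25 counts as above appetite. Gate rejections, which also route to a Rule 5 halt, are not modelled here.

```shell
# Hypothetical helper: pick the next step after a re-score.
# $1 = push score, $2 = release score (each out of 25),
# $3 = number of untried options (assumed representation of "not exhausted").
next_step() {
  appetite_max=4   # from the text: within appetite means <= 4/25
  if [ "$1" -le "$appetite_max" ] && [ "$2" -le "$appetite_max" ]; then
    echo "drain"      # within appetite: run the Drain action
  elif [ "$3" -gt 0 ]; then
    echo "continue"   # still above appetite, options remain: loop again
  else
    echo "halt"       # above appetite and exhausted: Rule 5 halt
  fi
}
```

For example, `next_step 5 2 1` prints `continue`, while `next_step 2 9 0` prints `halt`. No argument combination yields a release above appetite, which mirrors the invariant stated for Step 6.5.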
@@ -298,7 +298,7 @@ After the iteration's commit lands but before starting the next iteration, check
 
 Halt is a **bug signal** — the scorer should always have progressively more aggressive remediations available once P108 lands. Until then, exhaustion is expected when the only path to within-appetite requires a non-`move-to-holding` class.
 
-**Audit trail (ADR-
+**Audit trail (ADR-042 Rule 6):** append one line per auto-apply to the iteration summary's Auto-apply trail subsection, including remediation ID, action class, pre/post scores, action taken, and description citation. For `move-to-holding` actions, also append to `docs/changesets-holding/README.md` "Currently held".
 
 ### Step 6.75: Inter-iteration verification (P036)
 
@@ -337,7 +337,7 @@ When `AskUserQuestion` is unavailable or the user is AFK, the skill (and the del
 | Commit when risk within appetite | Auto-commit (manage-problem step 9e fallback) |
 | Commit when risk above appetite | Skip commit, report uncommitted state |
 | Pipeline risk at appetite (push or release = 4/25) | Drain release queue (`push:watch` then `release:watch`) before next iteration — per ADR-018 (Step 6.5) |
-| Pipeline risk above appetite (push or release >= 5/25) | Auto-apply scorer remediations
+| Pipeline risk above appetite (push or release >= 5/25) | Auto-apply scorer remediations incrementally (ADR-042 Rule 2). The agent reads suggestions and decides what to do. Re-score after each apply; drain when within appetite. **Never release above appetite** (ADR-042 Rule 1) — no AskUserQuestion shortcut. Halt the loop with `outcome: halted-above-appetite` if the loop exhausts without convergence (ADR-042 Rule 5). Verification Pending commits excluded from auto-revert (Rule 2b). Per ADR-042 (Step 6.5 Above-appetite branch). |
 | Origin diverged before start | Pull `--ff-only` if trivial; stop with report (`git log HEAD..origin/<base>` and reverse) if non-fast-forward — per ADR-019 (Step 0) |
 | Fix verification needed | Skip problem, add to "needs verification" list |
 | Stop-condition #2 with user-answerable skip-reasons | Emit Outstanding Design Questions table in summary (do NOT call AskUserQuestion). The persona is AFK by definition — per JTBD-006 and ADR-013 Rule 6 — so the table is the default. Interactive invocations may batch up to 4 questions through AskUserQuestion instead — per ADR-013 Rule 1 (Step 2.5). |
package/skills/work-problems/test/work-problems-above-appetite-remediation.bats
CHANGED
@@ -1,19 +1,19 @@
 #!/usr/bin/env bats
 # Doc-lint guard: work-problems SKILL.md must include the above-appetite
-# auto-apply + halt-on-exhaustion branch per ADR-
+# auto-apply + halt-on-exhaustion branch per ADR-042.
 #
 # Structural assertion — Permitted Exception to the source-grep ban (ADR-005 / P011).
 # These assertions are load-bearing-string checks on the skill specification
 # document. Per P081, structural tests are placeholders for behavioural tests
 # against P012's skill-testing harness; until that harness lands, these
-# assertions are the confirmation mechanism called out in ADR-
+# assertions are the confirmation mechanism called out in ADR-042 Confirmation
 # criterion 2.
 #
 # Cross-reference:
 # P103 (work-problems escalates resolved release decisions — defeats AFK)
 # P104 (partial-progress paints release queue into corner)
 # P108 (scorer remediation action-class vocabulary — deferred work)
-# ADR-
+# ADR-042 (auto-apply scorer remediations — open vocabulary — never release above appetite)
 # ADR-037 (skill testing strategy — contract-assertion pattern)
 # @jtbd JTBD-006 (Progress the Backlog While I'm Away)
 
@@ -26,9 +26,9 @@ setup() {
   [ -f "$SKILL_FILE" ]
 }
 
-@test "SKILL.md cites ADR-
-  # ADR-
-  run grep -n "ADR-
+@test "SKILL.md cites ADR-042 (above-appetite auto-apply)" {
+  # ADR-042 Confirmation criterion 1: source review names the ADR.
+  run grep -n "ADR-042" "$SKILL_FILE"
   [ "$status" -eq 0 ]
 }
 
@@ -53,7 +53,7 @@ setup() {
   [ "$status" -eq 0 ]
 }
 
-@test "SKILL.md names the
+@test "SKILL.md names the open action-class vocabulary (Rule 2a)" {
   # "move-to-holding" is the single supported class today; later P108 extends.
   # The string must appear so the enumeration is greppable.
   run grep -n "move-to-holding" "$SKILL_FILE"
@@ -120,3 +120,19 @@ setup() {
   run grep -niE "Auto-apply trail|audit trail" "$SKILL_FILE"
   [ "$status" -eq 0 ]
 }
+
+# ──────────────────────────────────────────────────────────────────────────────
+# P108: agent reads prose descriptions; no action_class column
+# ──────────────────────────────────────────────────────────────────────────────
+
+@test "SKILL.md has no action_class column reference (P108 — agent decides from prose)" {
+  # ADR-042 Rule 2a: no structured action_class column.
+  run grep -n "action_class" "$SKILL_FILE"
+  [ "$status" -ne 0 ]
+}
+
+@test "SKILL.md includes revert-commit example (P108)" {
+  # The orchestrator may choose to revert a commit based on scorer prose.
+  run grep -n "git revert" "$SKILL_FILE"
+  [ "$status" -eq 0 ]
+}
|
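The two P108 assertions added above pair a required string with a forbidden one. The same pattern can be sketched outside the bats runner as a plain shell function. `check_skill_doc` is a hypothetical name and is not part of the package's test suite; it relies only on `grep -q` exiting 0 when a line matches and non-zero otherwise.

```shell
# Hypothetical stand-alone check mirroring the two P108 assertions.
# $1 = path to the skill document.
check_skill_doc() {
  # Required string: at least one line must mention "git revert".
  if ! grep -q "git revert" "$1"; then
    echo "missing: git revert"
    return 1
  fi
  # Forbidden string: any match is the failing case.
  if grep -q "action_class" "$1"; then
    echo "forbidden: action_class"
    return 1
  fi
  echo "ok"
}
```

One detail to keep in mind when reading the negative assertion: `grep` exits 1 when nothing matches and with a status greater than 1 on errors such as an unreadable file, so a `-ne 0` check passes in both cases. In the bats file, the `[ -f "$SKILL_FILE" ]` check visible in `setup()` appears to cover the missing-file case.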