@windyroad/itil 0.3.3-preview.77 → 0.4.0-preview.81

package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@windyroad/itil",
3
- "version": "0.3.3-preview.77",
3
+ "version": "0.4.0-preview.81",
4
4
  "description": "ITIL-aligned IT service management for Claude Code (problem, and future incident/change skills)",
5
5
  "bin": {
6
6
  "windyroad-itil": "./bin/install.mjs"
@@ -8,6 +8,10 @@ allowed-tools: Read, Write, Edit, Bash, Glob, Grep, AskUserQuestion
8
8
 
9
9
  Create, update, or transition problem tickets following an ITIL-aligned problem management process. This skill is the authoritative definition of the problem management workflow — no separate process document is needed.
10
10
 
11
+ ## Output Formatting
12
+
13
+ When referencing problem IDs, ADR IDs, or JTBD IDs in prose output, always include the human-readable title on first mention. Use the format `P029 (Edit gate overhead for governance docs)`, not bare `P029`. Tables with separate ID and Title columns are fine as-is.
14
+
11
15
  ## Operations
12
16
 
13
17
  - **Create**: `problem <title or description>` — creates a new open problem
@@ -146,6 +150,28 @@ Do NOT ask for fields that can be inferred:
146
150
  - **Symptoms**: Infer from description if possible
147
151
  - **Workaround**: Default to "None identified yet." unless obvious from context
148
152
 
153
+ ### 4b. For new problems: Concern-boundary analysis (multi-concern check)
154
+
155
+ Before writing the problem file, perform a concern-boundary analysis on the gathered description to prevent conflated tickets that make WSJF scoring meaningless (P016).
156
+
157
+ **Self-check**: Read the description and root cause information gathered in step 4. Answer: "How many distinct root causes are present? If fixed independently, how many separate fix paths exist?"
158
+
159
+ - **Single concern** (one root cause, one fix path): proceed directly to step 5.
160
+ - **Multiple concerns** (two or more distinct root causes, different components, or an architect review that flagged the issue as needing its own ADR): present a split prompt.
161
+
162
+ **Split prompt** — use `AskUserQuestion`:
163
+ - `header: "Multi-concern problem"`
164
+ - `multiSelect: false`
165
+ - Options:
166
+ 1. `Split into separate problems (Recommended)` — description: "Create one problem ticket per distinct concern, with consecutive IDs. Each ticket gets its own priority, WSJF score, and fix path."
167
+ 2. `Keep as a single problem` — description: "Create one ticket covering all concerns. Use this only if the concerns are so tightly coupled that they cannot be fixed independently."
168
+
169
+ **Non-interactive fallback**: When `AskUserQuestion` is unavailable (e.g., non-interactive/AFK mode), automatically split into separate problems and note the auto-split in output. Do not block creation.
170
+
171
+ **Split implementation**: When splitting, assign consecutive IDs (e.g., if next ID is 035, create P035 and P036). Create each problem file independently. Cross-reference each ticket in the other's "Related" section.
172
+
173
+ **Scope**: This step applies only to **new problem creation** (steps 2–5). It does NOT apply to updates, status transitions, or reviews of existing tickets.
174
+
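The consecutive-ID rule in the split implementation above can be sketched in shell. This is a minimal sketch, not the skill's actual mechanism: the function name `next_problem_ids` is hypothetical, and it assumes the next ID can be derived from the highest `NNN-` filename prefix already present in the problems directory.

```shell
# Sketch only: next_problem_ids DIR COUNT prints COUNT consecutive problem IDs,
# zero-padded to three digits, starting after the highest existing NNN- prefix.
# The function name and the scan-by-filename approach are assumptions.
next_problem_ids() {
  dir=$1
  count=$2
  max=0
  for f in "$dir"/[0-9][0-9][0-9]-*.md; do
    [ -e "$f" ] || continue                  # glob matched nothing
    n=${f##*/}                               # strip the directory part
    n=${n%%-*}                               # keep only the NNN prefix
    n=$(printf '%s' "$n" | sed 's/^0*//')    # drop leading zeros
    [ -n "$n" ] || n=0
    if [ "$n" -gt "$max" ]; then
      max=$n
    fi
  done
  i=1
  while [ "$i" -le "$count" ]; do
    printf 'P%03d\n' $((max + i))
    i=$((i + 1))
  done
}
```

With `033-...` and `034-...` files present, `next_problem_ids docs/problems 2` would yield `P035` and `P036`, matching the example in the split implementation text.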
149
175
  ### 5. For new problems: Write the problem file
150
176
 
151
177
  **File path**: `docs/problems/<NNN>-<kebab-case-title>.open.md`
@@ -235,19 +261,24 @@ This is a batch operation that reviews every open/known-error problem and update
235
261
 
236
262
  **Fast-path for `work` (skip full re-scan when cache is fresh):**
237
263
 
238
- Before running the full review, check whether `docs/problems/README.md` exists and is up to date:
264
+ Before running the full review, check whether `docs/problems/README.md` exists and is up to date using **git history** (not filesystem mtime, which is unreliable in worktrees and fresh checkouts — see P031):
239
265
 
240
266
  ```bash
241
- find docs/problems -name "*.md" ! -name "README.md" -newer docs/problems/README.md 2>/dev/null | head -1
267
+ readme_commit=$(git log -1 --format=%H -- docs/problems/README.md 2>/dev/null)
268
+ # Cache is stale if: no README commit, OR problem files committed since README, OR uncommitted problem file changes
269
+ if [ -z "$readme_commit" ] || \
270
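+ git log --oneline "${readme_commit}..HEAD" -- 'docs/problems/*.md' ':!docs/problems/README.md' 2>/dev/null | grep -q . || \
+ git status --porcelain -- 'docs/problems/*.md' ':!docs/problems/README.md' 2>/dev/null | grep -q .; then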
271
+ echo "stale"
272
+ fi
242
273
  ```
243
274
 
244
- If this command produces **no output** (README.md is newer than all problem files), the cache is fresh:
275
+ If the command produces **no output** (no problem files have been committed or modified since the last README.md update), the cache is fresh:
245
276
  - Read `docs/problems/README.md` only — it contains the ranked table from the last review
246
277
  - Skip steps 9a–9b entirely
247
278
  - Proceed directly to step 9c (work selection) using the cached table
248
279
  - Note in the output: "Using cached ranking from [timestamp in README.md]"
249
280
 
250
- If the command produces output, or `README.md` does not exist, run the full review (steps 9a–9e) and refresh the cache.
281
+ If the command prints "stale", or `README.md` does not exist in git, run the full review (steps 9a–9e) and refresh the cache.
251
282
 
252
283
  **Step 9a: Read the risk framework**
253
284
 
@@ -304,7 +335,7 @@ Highlight:
304
335
 
305
336
  **Step 9d: Check for pending verifications**
306
337
 
307
- For each known-error that has a `## Fix Released` section, use `AskUserQuestion` to ask the user if the fix has been verified in production. If the user confirms, close the problem (`git mv` to `.closed.md`, update Status). If the user says no or is unsure, leave it as known-error.
338
+ For each known-error that has a `## Fix Released` section, use `AskUserQuestion` to ask the user if the fix has been verified in production. The question MUST include a fix summary extracted from the `## Fix Released` section — include the first sentence (or first bullet list) of that section in the question body or as the option description, so the user can answer without reading the full problem file. Do not ask with only the problem ID + title + version. If the user confirms, close the problem (`git mv` to `.closed.md`, update Status). If the user says no or is unsure, leave it as known-error.
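One way to obtain the fix summary for the step 9d question is sketched below. It is an approximation under stated assumptions: the helper name `fix_summary` is hypothetical, and it returns the first non-blank line under `## Fix Released` rather than strictly the first sentence or the whole first bullet list.

```shell
# Sketch only: fix_summary FILE prints the first non-blank line that follows
# the "## Fix Released" heading. The helper name is an assumption.
fix_summary() {
  awk '
    /^## Fix Released/ { in_section = 1; next }   # enter the section
    in_section && /^## / { exit }                 # the next heading ends it
    in_section && NF { print; exit }              # first non-blank line
  ' "$1"
}
```

The printed line could then be placed in the question body or the option description, so the prompt carries more than the problem ID, title, and version.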
308
339
 
309
340
  **Step 9e: Update files and refresh README.md cache**
310
341
 
@@ -333,9 +364,12 @@ Edit each problem file where the priority changed. Then write/overwrite `docs/pr
333
364
 
334
365
  Then commit all changed files per ADR-014:
335
366
  1. `git add` the changed problem files and `docs/problems/README.md`
336
- 2. Delegate to `wr-risk-scorer:pipeline` to assess and create a bypass marker
367
+ 2. Satisfy the commit gate: two paths are valid (either produces a bypass marker):
368
+ - **Primary**: delegate to the `wr-risk-scorer:pipeline` subagent-type via the Agent tool
369
+ - **Fallback**: if the `wr-risk-scorer:pipeline` subagent-type is not available in the current tool set (e.g., this skill is itself running inside a spawned subagent), invoke the `/wr-risk-scorer:assess-release` skill via the Skill tool. Per ADR-015 it wraps the same pipeline subagent and produces an equivalent bypass marker via the `PostToolUse:Agent` hook. Do not silently skip the gate because the primary path is unavailable — the fallback exists specifically to close this gap (see P035).
337
370
  3. `git commit -m "docs(problems): review — re-rank priorities"`
338
- If `AskUserQuestion` is unavailable and risk is above appetite, skip the commit and report the uncommitted state.
371
+
372
+ If `AskUserQuestion` is unavailable and risk is above appetite, skip the commit and report the uncommitted state (ADR-013 Rule 6 fail-safe). This applies only to the risk-above-appetite branch, not to the delegation-unavailable case above.
339
373
 
340
374
  ### 10. Quality checks
341
375
 
@@ -360,13 +394,15 @@ After any operation, report:
360
394
 
361
395
  Commit the completed work per ADR-014 (governance skills commit their own work):
362
396
  1. `git add` all created/modified files for this operation
363
- 2. Delegate to `wr-risk-scorer:pipeline` (subagent_type: `wr-risk-scorer:pipeline`) to assess the staged changes and create a bypass marker
397
+ 2. Satisfy the commit gate: two paths are valid (either produces a bypass marker):
398
+ - **Primary**: delegate to the `wr-risk-scorer:pipeline` subagent-type via the Agent tool (subagent_type: `wr-risk-scorer:pipeline`)
399
+ - **Fallback**: if the `wr-risk-scorer:pipeline` subagent-type is not available in the current tool set (e.g., this skill is itself running inside a spawned subagent), invoke the `/wr-risk-scorer:assess-release` skill via the Skill tool. Per ADR-015 it wraps the same pipeline subagent and the `PostToolUse:Agent` hook writes an equivalent bypass marker. Do not silently skip the gate because the primary path is unavailable — the fallback exists specifically to close this gap (see P035).
364
400
  3. `git commit -m "<message>"` using the convention for the operation type:
365
401
  - New problem: `docs(problems): open P<NNN> <title>`
366
402
  - Known Error transition: `docs(problems): P<NNN> known error — <root cause summary>`
367
403
  - Problem closed: `docs(problems): close P<NNN> <title>`
368
404
  - Review/re-rank: `docs(problems): review — re-rank priorities`
369
405
  - Fix implemented: `fix(<scope>): <description> (closes P<NNN>)` — include problem file changes in the same commit
370
- 4. If risk is above appetite: use `AskUserQuestion` to ask whether to commit anyway, remediate first, or park the work. If `AskUserQuestion` is unavailable, skip the commit and report the uncommitted state clearly.
406
+ 4. If risk is above appetite: use `AskUserQuestion` to ask whether to commit anyway, remediate first, or park the work. If `AskUserQuestion` is unavailable, skip the commit and report the uncommitted state clearly (ADR-013 Rule 6 fail-safe). This applies only to the risk-above-appetite branch, not to the delegation-unavailable case above.
371
407
 
372
408
  $ARGUMENTS
@@ -0,0 +1,64 @@
1
+ #!/usr/bin/env bats
2
+ # Doc-lint guard: manage-problem SKILL.md must include a concern-boundary
3
+ # analysis step for new problem creation.
4
+ #
5
+ # Structural assertion — Permitted Exception to the source-grep ban (ADR-005 / P011).
6
+ # These tests assert that the skill specification document conforms to the
7
+ # concern-boundary splitting contract introduced by P016.
8
+ #
9
+ # Cross-reference:
10
+ # P016: docs/problems/016-manage-problem-should-split-multi-concern-tickets.open.md
11
+ # ADR-013: docs/decisions/013-structured-user-interaction-for-governance-decisions.proposed.md
12
+ # @jtbd JTBD-001 (enforce governance without slowing down)
13
+ # @jtbd JTBD-101 (extend the suite with clear patterns)
14
+
15
+ setup() {
16
+ SKILL_DIR="$(cd "$(dirname "$BATS_TEST_FILENAME")/.." && pwd)"
17
+ SKILL_FILE="${SKILL_DIR}/SKILL.md"
18
+ }
19
+
20
+ @test "SKILL.md includes a concern-boundary analysis step for new problem creation" {
21
+ # P016: Before writing a problem file (step 5), the skill must check whether
22
+ # the description contains multiple distinct root causes or concerns, and offer
23
+ # to split if it does. This guards against conflated tickets that make WSJF
24
+ # scoring meaningless.
25
+ run grep -in "concern.boundary\|concern-boundary\|concern boundary\|boundary.*concern\|split.*concern\|multi.concern\|single.*concern" "$SKILL_FILE"
26
+ [ "$status" -eq 0 ]
27
+ }
28
+
29
+ @test "SKILL.md concern-boundary step uses AskUserQuestion, not prose (ADR-013)" {
30
+ # ADR-013 Rule 1: all branch points must use AskUserQuestion, not prose options.
31
+ # The concern-boundary split decision (split vs keep as one) is a branch point
32
+ # and must be handled with a structured AskUserQuestion call, not a
33
+ # '(a) split (b) keep' prose paragraph.
34
+ # This test verifies the split prompt references AskUserQuestion (not just that
35
+ # AskUserQuestion appears anywhere — the no-prose-options.bats test covers that).
36
+ run grep -n "concern.boundary\|concern-boundary\|concern boundary\|split.*concern\|multi.concern" "$SKILL_FILE"
37
+ [ "$status" -eq 0 ]
38
+ # The split decision must direct the skill to use AskUserQuestion
39
+ run grep -in "split.*AskUserQuestion\|AskUserQuestion.*split\|split.*question\|question.*split\|split.*ask\|concern.*AskUserQuestion\|AskUserQuestion.*concern" "$SKILL_FILE"
40
+ [ "$status" -eq 0 ]
41
+ }
42
+
43
+ @test "SKILL.md concern-boundary step is scoped to new problem creation (not updates)" {
44
+ # P016 fix must only fire during new problem creation (between steps 4 and 5),
45
+ # not during updates or transitions. Scope constraint prevents spurious split
46
+ # prompts on existing tickets being updated or transitioned.
47
+ # This checks that the concern-boundary step is placed in the 'new problems'
48
+ # section (steps 2-5), not in the update or transition sections.
49
+ run grep -n "For new problems\|new problem" "$SKILL_FILE"
50
+ [ "$status" -eq 0 ]
51
+ # The concern-boundary check must appear in the new-problems workflow context
52
+ run grep -A5 -i "concern.boundary\|concern-boundary\|concern boundary\|multi.concern\|split.*concern" "$SKILL_FILE"
53
+ [ "$status" -eq 0 ]
54
+ }
55
+
56
+ @test "SKILL.md concern-boundary step specifies non-interactive fallback with auto-split" {
57
+ # ADR-013 Rule 6: non-interactive fail-safe — when AskUserQuestion is unavailable,
58
+ # the skill must auto-split rather than hanging or silently dropping the split.
59
+ # This specifically requires "auto-split" or "automatically split" language in
60
+ # the concern-boundary step, not just general "AskUserQuestion unavailable" text
61
+ # which already exists for the commit step (step 11).
62
+ run grep -in "auto.split\|automatically split" "$SKILL_FILE"
63
+ [ "$status" -eq 0 ]
64
+ }
@@ -61,3 +61,14 @@ setup() {
61
61
  run grep -n "Scope change" "$SKILL_FILE"
62
62
  [ "$status" -eq 0 ]
63
63
  }
64
+
65
+ @test "SKILL.md step 9d requires fix summary extracted from Fix Released in AskUserQuestion (P030)" {
66
+ # P030: verification prompts must include a one-line fix summary extracted from the
67
+ # '## Fix Released' section so the user can answer without a clarifying round-trip.
68
+ # This checks that step 9d explicitly instructs including fix content in the question,
69
+ # not just detecting Fix Released to decide which problems need verification.
70
+ # The fix must add wording like "extract" or "include" + "Fix Released" + "summary"
71
+ # (or "question") within step 9d. A generic "Fix Released" mention is insufficient.
72
+ run grep -n "fix summary\|Fix Released.*question\|Fix Released.*summary\|extract.*Fix Released\|include.*Fix Released\|summary.*Fix Released" "$SKILL_FILE"
73
+ [ "$status" -eq 0 ]
74
+ }
@@ -0,0 +1,30 @@
1
+ #!/usr/bin/env bats
2
+ # Doc-lint guard: manage-problem SKILL.md must include the output formatting rule
3
+ # requiring human-readable titles alongside bare IDs (P032).
4
+ #
5
+ # Structural assertion — Permitted Exception to the source-grep ban (ADR-005 / P011).
6
+ # These tests assert that the skill specification document contains the output
7
+ # formatting instruction so agents include titles with IDs in prose output.
8
+ #
9
+ # Cross-reference:
10
+ # P032 (agent output uses opaque IDs without titles)
11
+ # @jtbd JTBD-001 (enforce governance without slowing down)
12
+
13
+ setup() {
14
+ SKILL_DIR="$(cd "$(dirname "$BATS_TEST_FILENAME")/.." && pwd)"
15
+ SKILL_FILE="${SKILL_DIR}/SKILL.md"
16
+ }
17
+
18
+ @test "SKILL.md contains output formatting section" {
19
+ run grep -n "## Output Formatting" "$SKILL_FILE"
20
+ [ "$status" -eq 0 ]
21
+ }
22
+
23
+ @test "SKILL.md output formatting rule requires titles with IDs (P032)" {
24
+ # P032: agents must include human-readable titles when referencing IDs in prose.
25
+ # The rule must mention including the title alongside IDs.
26
+ run grep -n "title" "$SKILL_FILE"
27
+ [ "$status" -eq 0 ]
28
+ run grep -n "Output Formatting" "$SKILL_FILE"
29
+ [ "$status" -eq 0 ]
30
+ }
@@ -59,10 +59,15 @@ setup() {
59
59
  [ "$status" -eq 0 ]
60
60
  }
61
61
 
62
- @test "SKILL.md describes checking README.md freshness before full re-scan" {
63
- # The work fast-path: if README.md is newer than all problem files,
64
- # skip the 18-file re-scan and read the cached table directly.
65
- # Proxy: SKILL.md mentions -newer (the find flag used for mtime comparison).
66
- run grep -q "\-newer" "$SKILL_FILE"
62
+ @test "SKILL.md describes checking README.md freshness using git history, not mtime" {
63
+ # P031: The mtime-based `find -newer` check is broken in git worktrees
64
+ # because all files receive the same mtime at checkout time.
65
+ # The cache-freshness check must use git log to compare commits, not
66
+ # filesystem timestamps.
67
+ # Positive: SKILL.md uses git log for cache freshness.
68
+ run grep -q "git log.*README\.md" "$SKILL_FILE"
67
69
  [ "$status" -eq 0 ]
70
+ # Negative: SKILL.md must NOT use find -newer for cache freshness.
71
+ run grep -q "\-newer" "$SKILL_FILE"
72
+ [ "$status" -ne 0 ]
68
73
  }
@@ -0,0 +1,150 @@
1
+ ---
2
+ name: wr-itil:work-problems
3
+ description: Batch-work ITIL problem tickets while the user is AFK. Loops through the problem backlog by WSJF priority, delegating each problem to wr-itil:manage-problem, and stops when nothing is left to progress. Use this skill whenever the user says things like "work through my problems", "grind problems", "work the backlog", "work problems while I'm away", "process problems AFK", or any request to autonomously work through multiple problem tickets without interactive input. Also trigger when the user asks to "loop" or "batch" problem work, or says they'll be away and wants problems handled.
4
+ allowed-tools: Skill, Bash, Glob, Grep, Read
5
+ ---
6
+
7
+ # Work Problems — AFK Batch Orchestrator
8
+
9
+ Autonomously loop through ITIL problem tickets by WSJF priority, working each one via `wr-itil:manage-problem`, until nothing actionable remains.
10
+
11
+ The user is AFK during this process, so every decision point that would normally require interactive input should be resolved automatically using safe defaults. The skill reports progress between iterations so the user can review what happened when they return.
12
+
13
+ ## How It Works
14
+
15
+ Each iteration is one cycle of: scan backlog, pick highest-WSJF problem, work it, report result. The loop continues until a stop condition is met.
16
+
17
+ ### Step 1: Scan the backlog
18
+
19
+ Read `docs/problems/README.md` if it exists and is fresh (check via git history — see manage-problem step 9 for the cache freshness check). If stale or missing, scan all `.open.md` and `.known-error.md` files in `docs/problems/`, extract their WSJF scores, and rank them.
20
+
21
+ Exclude:
22
+ - `.closed.md` files (done)
23
+ - `.parked.md` files (blocked on upstream)
24
+ - Problems with no WSJF score (need a review first — run `/wr-itil:manage-problem review` as the first iteration if scores are missing)
25
+
26
+ ### Step 2: Check stop conditions
27
+
28
+ Stop the loop and report a summary if any of these are true:
29
+
30
+ 1. **No actionable problems** — zero open or known-error problems remain
31
+ 2. **All remaining problems require interactive input** — e.g., they all need user verification (known-errors with `## Fix Released`), or their scope expanded beyond what's safe to auto-resolve
32
+ 3. **All remaining problems are blocked** — investigation hit a dead end, or the fix requires changes outside the project
33
+
34
+ When stopping, output a summary table of what was worked and what remains, then output exactly:
35
+
36
+ ```
37
+ ALL_DONE
38
+ ```
39
+
40
+ This sentinel line allows external scripts to detect completion.
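An external script might detect the sentinel as sketched below. The helper name `is_all_done` and the idea of capturing the skill's output to a file are assumptions; the only part taken from the text is that completion is signalled by a line reading exactly `ALL_DONE`.

```shell
# Sketch only: is_all_done FILE succeeds when FILE contains a line that is
# exactly ALL_DONE. The helper name is an assumption.
is_all_done() {
  grep -qx 'ALL_DONE' "$1"
}
```

Matching the whole line with `-x` avoids false positives when the sentinel text merely appears inside a longer status line.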
41
+
42
+ ### Step 3: Pick the highest-WSJF problem
43
+
44
+ Select the problem with the highest WSJF score. If there's a tie, prefer:
45
+ 1. Known Errors over Open problems (they have a confirmed fix path — less risk of wasted effort)
46
+ 2. Smaller effort over larger (faster throughput)
47
+ 3. Older reported date (longer wait = higher urgency)
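The selection and tie-break order above maps naturally onto a multi-key sort. The sketch below assumes a hypothetical tab-separated input that is not defined by the skill: WSJF score, status rank (0 for known error, 1 for open), effort rank (1 for S, 2 for M, 3 for L), reported date in `YYYY-MM-DD` form, and problem ID.

```shell
# Sketch only: pick_next reads tab-separated rows on stdin and prints the ID
# of the problem to work next. The column layout is an assumption:
#   wsjf <TAB> status_rank <TAB> effort_rank <TAB> reported_date <TAB> id
pick_next() {
  tab=$(printf '\t')
  # Highest WSJF first, then known errors, then smaller effort, then older date.
  LC_ALL=C sort -t "$tab" -k1,1nr -k2,2n -k3,3n -k4,4 | head -n 1 | cut -f5
}
```

ISO dates sort correctly as plain text, which is why the fourth key needs no numeric flag.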
48
+
49
+ ### Step 4: Classify each problem
50
+
51
+ Read the problem file and apply these deterministic rules:
52
+
53
+ | Problem state | Action |
54
+ |---|---|
55
+ | Known Error with `## Fix Released` | **Skip** — needs user verification |
56
+ | Known Error with fix strategy documented | **Work it** — implement the fix |
57
+ | Known Error without fix strategy | **Work it** — produce a fix strategy, then implement |
58
+ | Open problem with preliminary hypothesis or investigation notes | **Work it** — continue the investigation |
59
+ | Open problem with no leads (empty Root Cause Analysis) | **Work it** — read the relevant code, form a hypothesis, document findings |
60
+ | Problem previously attempted twice without progress in this session | **Skip** — mark as stuck, needs interactive attention |
61
+
62
+ The default is to work the problem. Only skip when the rule explicitly says so. This is an AFK loop — forward progress matters more than avoiding dead ends, because dead ends are cheap (findings are saved) and interactive input is expensive (user is absent).
63
+
64
+ **Time-box each problem** to avoid runaway investigation: the delegated `manage-problem` skill's internal logic decides scope. If investigation reveals the scope has grown (e.g., effort was estimated S but turns out to be L), save findings to the problem file, update the WSJF score, and move to the next problem. Never sink unbounded effort into one problem during AFK mode.
65
+
66
+ If a problem is skipped by this step, add it to a "skipped" list with the reason and loop back to step 3 for the next one.
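The two skip rules in the table above can be sketched as a small classifier. The function name `classify_problem` and the `ATTEMPTS` argument (the number of earlier attempts without progress in this session) are hypothetical; the sketch relies only on the `.known-error.md` suffix and the `## Fix Released` heading named in the text, and treats every other case as workable.

```shell
# Sketch only: classify_problem FILE [ATTEMPTS] prints "work" or a skip reason.
# ATTEMPTS is an assumed counter of earlier attempts without progress.
classify_problem() {
  file=$1
  attempts=${2:-0}
  if [ "$attempts" -ge 2 ]; then
    echo "skip: stuck, needs interactive attention"
    return 0
  fi
  case $file in
    *.known-error.md)
      # A released fix is waiting on the user, so it cannot progress while AFK.
      if grep -q '^## Fix Released' "$file"; then
        echo "skip: needs user verification"
        return 0
      fi
      ;;
  esac
  echo "work"                                # default: make forward progress
}
```

Keeping "work" as the fall-through mirrors the rule that the loop only skips when a rule explicitly says so.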
67
+
68
+ ### Step 5: Work the problem
69
+
70
+ Invoke the manage-problem skill:
71
+
72
+ ```
73
+ /wr-itil:manage-problem work highest WSJF problem that can be progressed non-interactively as the user is AFK
74
+ ```
75
+
76
+ The manage-problem skill will:
77
+ - Run a review if the cache is stale
78
+ - Select and work the highest-WSJF problem
79
+ - Use its built-in non-interactive fallbacks (auto-split multi-concern problems, auto-commit when risk is within appetite)
80
+ - Commit completed work per ADR-014
81
+
82
+ ### Step 6: Report progress
83
+
84
+ After each iteration, report:
85
+ - Which problem was worked (ID + title)
86
+ - What was done (investigated, transitioned to known-error, fix implemented, etc.)
87
+ - The outcome (success, partially progressed, skipped, scope expanded)
88
+ - How many problems remain in the backlog
89
+
90
+ Format as a brief status line, not a wall of text. The user will read these when they return.
91
+
92
+ **Example:**
93
+ ```
94
+ [Iteration 1] Worked P029 (Edit gate overhead for governance docs) — implemented fix, closed. 8 problems remain.
95
+ [Iteration 2] Worked P021 (Governance skill structured prompts) — investigated root cause, transitioned to known-error. 7 problems remain.
96
+ [Iteration 3] Skipped P016 (Multi-concern ticket splitting) — fix released, awaiting user verification. Worked P024 (Risk scorer WIP flag) — implemented fix, closed. 6 problems remain.
97
+ ```
98
+
99
+ ### Step 7: Loop
100
+
101
+ Go back to step 1. The backlog may have changed — new problems may have been created during fixes, priorities may have shifted, and the README.md cache will be stale.
102
+
103
+ ## Non-Interactive Decision Making
104
+
105
+ When `AskUserQuestion` is unavailable or the user is AFK, the skill (and the delegated manage-problem skill) should resolve decisions automatically:
106
+
107
+ | Decision Point | Non-Interactive Default |
108
+ |---|---|
109
+ | Which problem to work | Highest WSJF, no prompt needed |
110
+ | Multi-concern split | Auto-split (manage-problem step 4b fallback) |
111
+ | Scope expansion during work | Update problem file, re-score WSJF, move to next problem instead of continuing |
112
+ | Commit when risk within appetite | Auto-commit (manage-problem step 9e fallback) |
113
+ | Commit when risk above appetite | Skip commit, report uncommitted state |
114
+ | Fix verification needed | Skip problem, add to "needs verification" list |
115
+
116
+ ## Edge Cases
117
+
118
+ **Review needed first**: If no problems have WSJF scores, run `/wr-itil:manage-problem review` as the first iteration to score everything, then proceed to the work loop.
119
+
120
+ **Scope creep during investigation**: If investigating an open problem reveals the scope is larger than expected (effort re-sized from S to L), save findings to the problem file, update the WSJF score, and move to the next problem. Don't sink unlimited effort into one problem during AFK mode — the user can decide when they return.
121
+
122
+ **Circular work**: If the same problem keeps appearing as highest-WSJF across iterations without making progress, skip it after the second attempt and note it as "stuck — needs interactive attention".
123
+
124
+ **Git conflicts**: If a commit fails due to conflicts, stop the loop and report the conflict. Don't try to resolve conflicts non-interactively.
125
+
126
+ ## Output Format
127
+
128
+ The skill should produce a final summary when the loop ends:
129
+
130
+ ```
131
+ ## Work Problems Summary
132
+
133
+ ### Completed
134
+ | # | Problem | Action | Result |
135
+ |---|---------|--------|--------|
136
+ | 1 | P029 (Edit gate overhead) | Implemented fix | Closed |
137
+ | 2 | P021 (Structured prompts) | Investigated root cause | Transitioned to Known Error |
138
+
139
+ ### Skipped
140
+ | Problem | Reason |
141
+ |---------|--------|
142
+ | P016 (Multi-concern splitting) | Awaiting user verification |
143
+
144
+ ### Remaining Backlog
145
+ | WSJF | Problem | Status |
146
+ |------|---------|--------|
147
+ | 9.0 | P012 (Skill testing harness) | Open |
148
+
149
+ ALL_DONE
150
+ ```
@@ -0,0 +1,45 @@
1
+ {
2
+ "skill_name": "wr-itil:work-problems",
3
+ "evals": [
4
+ {
5
+ "id": 1,
6
+ "prompt": "work through my problems while I'm away — I'll be AFK for a bit so just grind through whatever you can",
7
+ "expected_output": "Works highest-WSJF problems in sequence, reports progress per iteration, outputs ALL_DONE when nothing is left",
8
+ "files": [],
9
+ "expectations": [
10
+ "Output includes a progress line for each iteration with problem ID and title",
11
+ "Output includes a final summary table with Completed and Skipped sections",
12
+ "Output ends with the ALL_DONE sentinel",
13
+ "Problems are worked in WSJF priority order (highest first)",
14
+ "Problems needing user verification (Fix Released) are skipped with reason",
15
+ "Each iteration reports how many problems remain"
16
+ ]
17
+ },
18
+ {
19
+ "id": 2,
20
+ "prompt": "grind the problem backlog for me — do as many as you can without asking me anything",
21
+ "expected_output": "Same loop behavior but triggered by different phrasing. Should still work WSJF-ordered, skip interactive decisions, report progress.",
22
+ "files": [],
23
+ "expectations": [
24
+ "Skill triggers correctly from casual 'grind the backlog' phrasing",
25
+ "Does not use AskUserQuestion during execution",
26
+ "Commits work automatically when risk is within appetite",
27
+ "Handles scope expansion conservatively (saves findings, moves to next problem)",
28
+ "Stops when no more actionable problems remain"
29
+ ]
30
+ },
31
+ {
32
+ "id": 3,
33
+ "prompt": "I need to step out — can you work through the open problems? Start with the highest priority ones. Don't wait for me on anything, just make the best call you can.",
34
+ "expected_output": "Runs review if cache is stale, then loops through problems. Demonstrates the review-first-then-work pattern and handles edge cases.",
35
+ "files": [],
36
+ "expectations": [
37
+ "Runs a review or uses cached rankings before starting work",
38
+ "Known Errors are preferred over Open problems when WSJF is tied",
39
+ "If a problem is attempted twice without progress, it is skipped as stuck",
40
+ "Git conflicts cause the loop to stop with a clear report",
41
+ "Final output includes Remaining Backlog section showing what's left"
42
+ ]
43
+ }
44
+ ]
45
+ }