@windyroad/itil 0.23.3-preview.259 → 0.23.4-preview.261

@@ -1,5 +1,5 @@
1
1
  {
2
2
  "name": "wr-itil",
3
- "version": "0.23.3",
3
+ "version": "0.23.4",
4
4
  "description": "ITIL-aligned IT service management for Claude Code"
5
5
  }
@@ -0,0 +1,2 @@
1
+ #!/usr/bin/env bash
2
+ exec "$(dirname "$0")/../scripts/classify-readme-drift.sh" "$@"
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@windyroad/itil",
3
- "version": "0.23.3-preview.259",
3
+ "version": "0.23.4-preview.261",
4
4
  "description": "ITIL-aligned IT service management for Claude Code (problem, and future incident/change skills)",
5
5
  "bin": {
6
6
  "windyroad-itil": "./bin/install.mjs"
@@ -66,7 +66,7 @@ Render three sections matching the README.md format so cached and live output lo
66
66
  | <score> | P<NNN> | <title> | <severity> | <status> | <effort> |
67
67
  ```
68
68
 
69
- **Verification Queue** — `.verifying.md` tickets, sorted by release age (oldest first):
69
+ **Verification Queue** — `.verifying.md` tickets, sorted by `Released date ASC` (oldest at row 1; same-day releases tiebreak by ID ASC) per ADR-022 + P048 user-task semantics. <!-- VQ-SORT-DIRECTION: oldest-first per ADR-022 --> Drift here re-opens P150.
70
70
 
71
71
  ```
72
72
  | ID | Title | Released | Likely verified? |
@@ -193,9 +193,33 @@ The `wr-itil-reconcile-readme` command is a `$PATH`-resolved shim shipped in `pa
193
193
 
194
194
  Exit-code routing:
195
195
  - **Exit 0 (clean)**: continue to Step 1.
196
- - **Exit 1 (drift detected)**: structured diff lines printed to stdout, one per drift entry (≤150 bytes per ADR-038 progressive-disclosure budget). **Halt this invocation** with a directive to invoke `/wr-itil:reconcile-readme` (interactive mode) or auto-route through the same skill in non-interactive mode (per ADR-013 Rule 6, AFK orchestrator). The reconciliation must complete and commit before this manage-problem invocation proceeds. Proceeding into ticket creation / update / transition with a stale README would re-encode the drift into the post-operation refresh and propagate the lie.
196
+ - **Exit 1 (drift detected)**: structured diff lines printed to stdout, one per drift entry (≤150 bytes per ADR-038 progressive-disclosure budget). Capture stdout to a temp file and classify the drift via the **uncommitted-rename carve-out** (P149) before halt-routing; see "Drift classification carve-out" immediately below.
197
197
  - **Exit 2 (parse error)**: README missing or malformed. Halt with the parse-error message; this needs investigation, not mechanical reconciliation. AFK orchestrators halt-with-report per ADR-013 Rule 6.
198
198
 
199
+ #### Drift classification carve-out (P149)
200
+
201
+ The Exit 1 halt-and-route path is correct for **committed cross-session drift** — a past session committed a ticket transition without staging the README refresh, and proceeding now would re-encode the drift into the post-operation refresh and propagate the lie. It is **wrong for uncommitted-rename-rooted drift** — when the current working tree carries a staged ticket rename (a same-session `git mv` that the in-flow P094 / P062 refresh at Step 5 / Step 7 will reconcile in the upcoming commit per ADR-014's single-commit grain). Halting in the latter case forces a separate `/wr-itil:reconcile-readme` commit, splitting one logical change across two commits and violating the ADR-014 grain.
202
+
203
+ Run the classifier on Exit 1 to distinguish the two cases:
204
+
205
+ ```bash
206
+ wr-itil-reconcile-readme docs/problems > /tmp/wr-itil-drift-$$.txt
207
+ reconcile_exit=$?
208
+ if [ "$reconcile_exit" -eq 1 ]; then
209
+ wr-itil-classify-readme-drift /tmp/wr-itil-drift-$$.txt docs/problems
210
+ classify_exit=$?
211
+ rm -f /tmp/wr-itil-drift-$$.txt
212
+ fi
213
+ ```
214
+
215
+ The `wr-itil-classify-readme-drift` command is a `$PATH`-resolved shim (ADR-049 naming grammar) dispatching `packages/itil/scripts/classify-readme-drift.sh`. It cross-references the drifting IDs from the script's stdout against `git status --porcelain docs/problems/` filtered for staged rename (`R`) entries — the destination path's ticket ID is the post-rename status the in-flow refresh will reconcile.
216
+
217
+ Classifier exit-code routing:
218
+
219
+ - **`classify_exit == 0` (INLINE_REFRESH)**: every drifting ID is the destination of a staged rename in the working tree. Log a one-line note ("Step 0 reconcile drift covered by N staged rename(s); deferring README refresh to in-flow Step 5 / Step 7 per P094 / P062 + ADR-014 single-commit grain") and continue to Step 1. Do NOT invoke `/wr-itil:reconcile-readme` — the in-flow refresh will land the README correction in the same commit as the ticket work.
220
+ - **`classify_exit == 1` (HALT_ROUTE_RECONCILE)**: at least one drifting ID is NOT covered by a staged rename — committed cross-session drift OR mixed (some IDs in working tree, some committed-only). **Halt this invocation** with a directive to invoke `/wr-itil:reconcile-readme` (interactive mode) or auto-route through the same skill in non-interactive mode (per ADR-013 Rule 6, AFK orchestrator). The reconciliation must complete and commit before this manage-problem invocation proceeds. Mixed routes to halt because `/wr-itil:reconcile-readme` resolves both classes safely; the in-flow refresh only handles the rename'd subset.
221
+ - **`classify_exit == 2` (parse error)**: classifier received empty / missing drift input — contract violation upstream. Fall back to the conservative halt-and-route path.
222
+
199
223
  This is a **preflight CHECK only** — manage-problem does NOT itself apply edits. The edit application lives in `/wr-itil:reconcile-readme`'s Step 4 with narrative preservation. Per architect verdict on P118 (Q3): manage-problem and work-problems Step 0 invoke the script (cheap mechanical check); transition-problem does NOT (P062 already covers transition-time refresh inside the same commit, redundant preflight there would pay the cost on every transition).
200
224
 
201
225
  This step is a robustness layer ON TOP of P094 + P062, not a supersession of either — both per-operation contracts remain in force at Step 5 (creation refresh) and Step 7 (transition refresh).
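
The cross-reference described in the carve-out above can be sketched roughly as follows. This is an illustrative sketch only, not the shipped `classify-readme-drift.sh`: the function name `classify_drift`, its second argument (a file holding captured `git status --porcelain docs/problems/` output), the assumption that each drift line contains an ID such as `P149`, and the assumption that ticket files are named `<NNN>-<slug>.<status>.md` are all hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: classify README drift against staged renames.
# Assumptions (not confirmed by the diff): drift lines contain IDs like
# "P149"; ticket files are named "<NNN>-<slug>.<status>.md".

# classify_drift DRIFT_FILE PORCELAIN_FILE
#   DRIFT_FILE     - captured stdout of the reconcile check
#   PORCELAIN_FILE - captured `git status --porcelain docs/problems/` output
# Returns 0 (INLINE_REFRESH), 1 (HALT_ROUTE_RECONCILE), 2 (parse error).
classify_drift() {
  local drift_file=$1 porcelain_file=$2
  local ids renamed id

  # Empty or missing drift input is a contract violation upstream.
  [ -s "$drift_file" ] || return 2

  # Drifting ticket IDs, e.g. "P149" becomes "149".
  ids=$(grep -oE 'P[0-9]+' "$drift_file" | tr -d 'P' | sort -u)
  [ -n "$ids" ] || return 2

  # Destination IDs of staged renames: index status "R" in column 1,
  # path printed as "old -> new"; keep the leading digits of the new name.
  renamed=$(grep -E '^R' "$porcelain_file" \
    | sed -E 's/.* -> //; s#.*/##' \
    | grep -oE '^[0-9]+' | sort -u)

  for id in $ids; do
    # Any drifting ID without a covering staged rename forces the halt route.
    printf '%s\n' "$renamed" | grep -qx "$id" || return 1
  done
  return 0
}
```

Taking the porcelain output as a file rather than calling `git` inside the function is a choice made for this sketch so the rule can be exercised without a repository; a real caller would capture `git status --porcelain docs/problems/` first.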
@@ -410,6 +434,8 @@ After writing the new `.open.md` file, regenerate `docs/problems/README.md` to i
410
434
 
411
435
  **WSJF Rankings tie-break sort (P138)**: rows in the WSJF Rankings table are sorted by the multi-key `(WSJF desc, Known-Error-first, Effort-divisor asc, Reported-date asc, ID asc)` so the rendered top-to-bottom row order matches `/wr-itil:work-problems` SKILL.md Step 3's tie-break selection 1:1. The first key (WSJF desc) sets the tier; within a tier the next three keys are the canonical tie-break ladder (Known Error before Open; smaller effort before larger; older Reported date before newer); ID asc is the deterministic final tiebreaker for full-tie cases. The table MUST include a `Reported` column so the third tie-break input is visible to README readers — without it, users cannot reconcile the rendered order against the orchestrator's selection. <!-- TIE-BREAK-LADDER-SOURCE: /wr-itil:work-problems SKILL.md Step 3 --> Any future change to the tie-break ladder MUST update this render block, the Step 7 P062 block, the Step 9e template, AND `/wr-itil:review-problems` SKILL.md Step 3 / Step 5 — drift here re-opens P138.
412
436
 
437
+ **Verification Queue sort direction (P150)**: rows in the Verification Queue table are sorted by `Released date ASC` (oldest at row 1; same-day releases tiebreak by ID ASC) per ADR-022 + P048 user-task semantics — older entries are the most likely-verified candidates the user wants to surface first when closing the queue. Newest-first ordering pushes those actionable closure candidates below the fold and contradicts the section header. <!-- VQ-SORT-DIRECTION: oldest-first per ADR-022 --> Any future change to the VQ sort direction MUST update this render block, the Step 7 P062 block, the Step 9c presentation block, the Step 9e template, AND `/wr-itil:review-problems` + `/wr-itil:transition-problem` + `/wr-itil:transition-problems` + `/wr-itil:reconcile-readme` + `/wr-itil:list-problems` — drift here re-opens P150.
438
+
413
439
  1. After `Write`-ing the new `.open.md` file (and, for multi-concern splits per step 4b, after all split files are written), regenerate `docs/problems/README.md` in-place reflecting the new filename set.
414
440
  2. Update the "Last reviewed" line per the **Last-reviewed line discipline (P134)** subsection below — name the new ticket as the most-recent fragment (e.g. `P<NNN> opened — <one-line title>`); displaced prior fragments rotate to `docs/problems/README-history.md`.
415
441
  3. `git add docs/problems/README.md` — the stage list at Step 11 must include it alongside the new `.open.md` file (Step 11's `git add -u` catch-all handles tracked-file modifications; the new README render lands via this path when README.md already exists in git, and via an explicit `git add docs/problems/README.md` when it is newly created). When line-3 truncation displaces prior content, also `git add docs/problems/README-history.md`.
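
The multi-key WSJF ordering above maps naturally onto a single `sort` invocation. The sketch below is illustrative only: the function name `sort_wsjf_rows` and the tab-separated column layout (WSJF, status rank where 0 means Known Error and 1 means Open, effort divisor, Reported date, numeric ID, title) are assumptions made for the example, not a format the skill defines.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the tie-break ladder as one multi-key sort.
# Assumed stdin columns (tab-separated):
#   1 WSJF score, 2 status rank (0 = Known Error, 1 = Open),
#   3 effort divisor, 4 Reported date (YYYY-MM-DD), 5 numeric ID, 6 title
sort_wsjf_rows() {
  # Key order: WSJF desc, Known Error first, smaller effort first,
  # older Reported date first, then ID asc as the final tiebreaker.
  LC_ALL=C sort -t $'\t' \
    -k1,1nr -k2,2n -k3,3n -k4,4 -k5,5n
}
```

ISO dates sort correctly as plain strings, which is why the fourth key needs no numeric flag. `LC_ALL=C` keeps the comparison locale-independent so the rendered order does not shift between machines.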
@@ -575,6 +601,8 @@ The refresh uses the same rendering rules as Step 9e (glob `docs/problems/*.open
575
601
 
576
602
  **WSJF Rankings tie-break sort (P138)**: rows in the WSJF Rankings table are sorted by the multi-key `(WSJF desc, Known-Error-first, Effort-divisor asc, Reported-date asc, ID asc)` so the rendered top-to-bottom row order matches `/wr-itil:work-problems` SKILL.md Step 3's tie-break selection 1:1. Within each WSJF tier, rows are ordered by the canonical tie-break ladder: Known Error before Open, smaller Effort before larger, older Reported date before newer. The table MUST include a `Reported` column so the third tie-break input is visible to README readers. <!-- TIE-BREAK-LADDER-SOURCE: /wr-itil:work-problems SKILL.md Step 3 --> Any future change to the tie-break ladder MUST update this render block, the Step 5 P094 block, the Step 9e template, AND `/wr-itil:review-problems` SKILL.md Step 3 / Step 5 — drift here re-opens P138.
577
603
 
604
+ **Verification Queue sort direction (P150)**: rows in the Verification Queue table are sorted by `Released date ASC` (oldest at row 1; same-day releases tiebreak by ID ASC) per ADR-022 + P048 user-task semantics — older entries are the most likely-verified candidates the user wants to surface first when closing the queue. Newest-first ordering pushes those actionable closure candidates below the fold and contradicts the section header. <!-- VQ-SORT-DIRECTION: oldest-first per ADR-022 --> Any future change to the VQ sort direction MUST update this render block, the Step 5 P094 block, the Step 9c presentation block, the Step 9e template, AND `/wr-itil:review-problems` + `/wr-itil:transition-problem` + `/wr-itil:transition-problems` + `/wr-itil:reconcile-readme` + `/wr-itil:list-problems` — drift here re-opens P150.
605
+
578
606
  **Mechanism:**
579
607
 
580
608
  1. After renaming + Editing + `git add`-ing the transitioned ticket file (per the staging-trap rule above), regenerate `docs/problems/README.md` in-place reflecting the new filename set and the transitioned ticket's new Status.
@@ -667,7 +695,7 @@ After reviewing all problems, present a WSJF-ranked table for open/known-error p
667
695
  | WSJF | ID | Title | Severity | Status | Effort | Reported | Notes |
668
696
  |------|-----|-------|----------|--------|--------|----------|-------|
669
697
 
670
- Then present a separate **Verification Queue** section for `.verifying.md` files (per ADR-022 — ranked by release age, oldest first; no WSJF because the multiplier is 0). Highlight each ticket whose release age is **≥ 14 days** (the within-skill default per P048 Candidate 4 — tunable; if it needs cross-skill consistency later, promote to policy) with a `likely verified` marker in the final column. This makes the Verification Queue not just a list but a ranked view of which verifications are most likely ready to close:
698
+ Then present a separate **Verification Queue** section for `.verifying.md` files (per ADR-022 — ranked by release age, oldest first; no WSJF because the multiplier is 0). Sort key + direction is the canonical `Released date ASC` (oldest at row 1; same-day releases tiebreak by ID ASC) — drift here re-opens P150. <!-- VQ-SORT-DIRECTION: oldest-first per ADR-022 --> Highlight each ticket whose release age is **≥ 14 days** (the within-skill default per P048 Candidate 4 — tunable; if it needs cross-skill consistency later, promote to policy) with a `likely verified` marker in the final column. This makes the Verification Queue not just a list but a ranked view of which verifications are most likely ready to close — older entries are the most likely-verified candidates the user wants to surface first when closing the queue:
671
699
 
672
700
  | ID | Title | Released | Fix summary | Likely verified? |
673
701
  |----|-------|----------|-------------|------------------|
@@ -727,7 +755,7 @@ Edit each problem file where the priority changed. Then write/overwrite `docs/pr
727
755
 
728
756
  ## Verification Queue
729
757
 
730
- Fix released, awaiting user verification (driven off `docs/problems/*.verifying.md` via glob — per ADR-022). Ranked by release age, oldest first:
758
+ Fix released, awaiting user verification (driven off `docs/problems/*.verifying.md` via glob — per ADR-022). Sorted by `Released date ASC` (oldest at row 1; same-day releases tiebreak by ID ASC). <!-- VQ-SORT-DIRECTION: oldest-first per ADR-022 --> Drift here re-opens P150 — any change to VQ sort direction MUST update the Step 5 P094 block, the Step 7 P062 block, the Step 9c presentation block, this template, AND `/wr-itil:review-problems` + `/wr-itil:transition-problem` + `/wr-itil:transition-problems` + `/wr-itil:reconcile-readme` + `/wr-itil:list-problems`.
731
759
 
732
760
  | ID | Title | Released | Fix summary |
733
761
  |----|-------|----------|-------------|
@@ -0,0 +1,206 @@
1
+ #!/usr/bin/env bats
2
+
3
+ # P150: docs/problems/README.md Verification Queue must be rendered
4
+ # oldest-first (by Released date ASC, oldest at row 1) per ADR-022 +
5
+ # P048 user-task semantics. The header has long claimed "Ranked by
6
+ # release age, oldest first" while the rendered table drifted to
7
+ # newest-first across multiple SKILL.md render sites. This file
8
+ # encodes the canonical sort spec + greppable VQ-SORT-DIRECTION
9
+ # marker as a contract assertion across every render block, plus a
10
+ # behavioural fixture that asserts the actual sort outcome.
11
+ #
12
+ # Hybrid coverage per ADR-005 + ADR-037:
13
+ # - Structural contract-assertions (Permitted Exception per ADR-005 /
14
+ # contract-assertion pattern per ADR-037): each of the render-block
15
+ # sites carries the canonical VQ-SORT-DIRECTION marker.
16
+ # - One behavioural fixture sort: 4 .verifying.md tickets with known
17
+ # Released dates. Apply the documented ASC-by-date sort. Assert
18
+ # row 1 = the oldest entry; row N = the newest.
19
+ #
20
+ # @problem P150
21
+ # @jtbd JTBD-001 (enforce governance without slowing down — predictable
22
+ # render order visible across the README and from `list-problems`)
23
+ # @jtbd JTBD-006 (progress backlog AFK — verification candidates ready
24
+ # to close are at the top of the queue, not the bottom)
25
+ #
26
+ # Cross-reference:
27
+ # P150: docs/problems/150-readme-verification-queue-rendered-newest-first-contradicts-oldest-first-header.*.md
28
+ # P138: sibling fix on the WSJF Rankings table — same fix shape
29
+ # P048: introduced the Verification Queue + Likely verified column
30
+ # ADR-005 — plugin testing strategy / Permitted Exception
31
+ # ADR-022 — `.verifying.md` lifecycle; VQ rendering
32
+ # ADR-037 — contract-assertion bats pattern
33
+
34
+ setup() {
35
+ REPO_ROOT="$(cd "$(dirname "$BATS_TEST_FILENAME")/../../../../.." && pwd)"
36
+ MANAGE_SKILL="$REPO_ROOT/packages/itil/skills/manage-problem/SKILL.md"
37
+ REVIEW_SKILL="$REPO_ROOT/packages/itil/skills/review-problems/SKILL.md"
38
+ TRANSITION_SKILL="$REPO_ROOT/packages/itil/skills/transition-problem/SKILL.md"
39
+ TRANSITIONS_SKILL="$REPO_ROOT/packages/itil/skills/transition-problems/SKILL.md"
40
+ RECONCILE_SKILL="$REPO_ROOT/packages/itil/skills/reconcile-readme/SKILL.md"
41
+ LIST_SKILL="$REPO_ROOT/packages/itil/skills/list-problems/SKILL.md"
42
+
43
+ TEST_TMP="$(mktemp -d)"
44
+ }
45
+
46
+ teardown() {
47
+ if [ -n "${TEST_TMP:-}" ] && [ -d "$TEST_TMP" ]; then
48
+ rm -rf "$TEST_TMP"
49
+ fi
50
+ }
51
+
52
+ # ---------------------------------------------------------------------------
53
+ # Structural contract-assertions — VQ-SORT-DIRECTION marker
54
+ # ---------------------------------------------------------------------------
55
+
56
+ @test "manage-problem render blocks carry the VQ-SORT-DIRECTION marker" {
57
+ # Each render block writing the Verification Queue must carry the
58
+ # canonical greppable marker pointing back to ADR-022 (the
59
+ # framework-resolved source of the VQ ordering contract). Drift
60
+ # across render sites re-opens P150.
61
+ run grep -F '<!-- VQ-SORT-DIRECTION: oldest-first per ADR-022 -->' "$MANAGE_SKILL"
62
+ [ "$status" -eq 0 ]
63
+ # Marker must appear at the three manage-problem render sites:
64
+ # Step 5 P094 (refresh on new ticket), Step 7 P062 (refresh on
65
+ # transition), Step 9e (review-emit template).
66
+ count=$(grep -c -F '<!-- VQ-SORT-DIRECTION: oldest-first per ADR-022 -->' "$MANAGE_SKILL")
67
+ [ "$count" -ge 3 ]
68
+ }
69
+
70
+ @test "review-problems renders the VQ-SORT-DIRECTION marker" {
71
+ run grep -F '<!-- VQ-SORT-DIRECTION: oldest-first per ADR-022 -->' "$REVIEW_SKILL"
72
+ [ "$status" -eq 0 ]
73
+ }
74
+
75
+ @test "transition-problem Step 7 README refresh carries the VQ-SORT-DIRECTION marker" {
76
+ run grep -F '<!-- VQ-SORT-DIRECTION: oldest-first per ADR-022 -->' "$TRANSITION_SKILL"
77
+ [ "$status" -eq 0 ]
78
+ }
79
+
80
+ @test "transition-problems batch render carries the VQ-SORT-DIRECTION marker" {
81
+ run grep -F '<!-- VQ-SORT-DIRECTION: oldest-first per ADR-022 -->' "$TRANSITIONS_SKILL"
82
+ [ "$status" -eq 0 ]
83
+ }
84
+
85
+ @test "reconcile-readme rendering carries the VQ-SORT-DIRECTION marker" {
86
+ run grep -F '<!-- VQ-SORT-DIRECTION: oldest-first per ADR-022 -->' "$RECONCILE_SKILL"
87
+ [ "$status" -eq 0 ]
88
+ }
89
+
90
+ @test "list-problems VQ rendering carries the VQ-SORT-DIRECTION marker" {
91
+ run grep -F '<!-- VQ-SORT-DIRECTION: oldest-first per ADR-022 -->' "$LIST_SKILL"
92
+ [ "$status" -eq 0 ]
93
+ }
94
+
95
+ # ---------------------------------------------------------------------------
96
+ # Structural contract-assertions — sort-direction phrase consistency
97
+ # ---------------------------------------------------------------------------
98
+
99
+ @test "manage-problem render blocks document the Released-date ASC direction" {
100
+ # Free-form explanation of the sort key + direction must accompany
101
+ # the marker so a reader doesn't have to chase the ADR to understand
102
+ # what the marker authorises.
103
+ run grep -F 'Released date ASC' "$MANAGE_SKILL"
104
+ [ "$status" -eq 0 ]
105
+ count=$(grep -c -F 'Released date ASC' "$MANAGE_SKILL")
106
+ [ "$count" -ge 3 ]
107
+ }
108
+
109
+ @test "review-problems documents the Released-date ASC direction" {
110
+ run grep -F 'Released date ASC' "$REVIEW_SKILL"
111
+ [ "$status" -eq 0 ]
112
+ }
113
+
114
+ # ---------------------------------------------------------------------------
115
+ # Structural contract-assertions — drift-warning prose
116
+ # ---------------------------------------------------------------------------
117
+
118
+ @test "manage-problem render blocks warn that drift re-opens P150" {
119
+ # The cross-coupling note must explicitly name P150 so future agents
120
+ # who consider relaxing the VQ sort direction see the regression risk.
121
+ count=$(grep -c -F 'drift here re-opens P150' "$MANAGE_SKILL")
122
+ [ "$count" -ge 3 ]
123
+ }
124
+
125
+ @test "review-problems renders the drift-re-opens-P150 warning" {
126
+ run grep -F 'drift re-opens P150' "$REVIEW_SKILL"
127
+ [ "$status" -eq 0 ]
128
+ }
129
+
130
+ # ---------------------------------------------------------------------------
131
+ # Behavioural fixture: ASC-by-Released-date puts oldest at row 1
132
+ # ---------------------------------------------------------------------------
133
+
134
+ @test "behavioural: VQ sort by Released date ASC puts oldest entry at row 1" {
135
+ # Fixture: 4 .verifying.md tickets with known Released dates spanning
136
+ # 2026-04-22 to 2026-05-02. Encode each as a tab-separated row whose
137
+ # columns are the sort axes (Released date, ID, Title). Apply the
138
+ # documented ASC-by-Released sort and assert the output row order
139
+ # places the oldest entry at row 1 and the newest at row N.
140
+ #
141
+ # This is the regression guard against the drift documented in P150 —
142
+ # before the fix, render sites iterated newest-first and pushed the
143
+ # actionable closure candidates (oldest entries) below the fold.
144
+
145
+ fixture_in="$TEST_TMP/fixture-vq.tsv"
146
+ cat >"$fixture_in" <<'EOF'
147
+ 2026-05-02 148 P148: youngest released
148
+ 2026-04-29 144 P144: 3 days old
149
+ 2026-04-25 120 P120: week-old
150
+ 2026-04-22 093 P093: oldest released
151
+ EOF
152
+
153
+ # Canonical sort: Released date ASC (oldest first), ID ASC as final
154
+ # tiebreaker for same-day releases.
155
+ sorted=$(sort -t$'\t' -k1,1 -k2,2n "$fixture_in" | cut -f3)
156
+ expected="P093: oldest released
157
+ P120: week-old
158
+ P144: 3 days old
159
+ P148: youngest released"
160
+ [ "$sorted" = "$expected" ]
161
+ }
162
+
163
+ @test "behavioural: same-day Released uses ID ASC as the final tiebreaker" {
164
+ # Regression guard: when two tickets share a Released date, the ID
165
+ # ASC tiebreaker must produce a deterministic order. Without an
166
+ # explicit final tiebreaker, render-time row order can shift on
167
+ # every refresh and look like content drift in git diff.
168
+ fixture_in="$TEST_TMP/fixture-vq-sameday.tsv"
169
+ cat >"$fixture_in" <<'EOF'
170
+ 2026-05-02 148 P148: same day high
171
+ 2026-05-02 147 P147: same day mid
172
+ 2026-05-02 146 P146: same day low
173
+ EOF
174
+
175
+ sorted=$(sort -t$'\t' -k1,1 -k2,2n "$fixture_in" | cut -f3)
176
+ expected="P146: same day low
177
+ P147: same day mid
178
+ P148: same day high"
179
+ [ "$sorted" = "$expected" ]
180
+ }
181
+
182
+ @test "behavioural: oldest-first ordering surfaces likely-verified candidates first" {
183
+ # P048 user-task semantics: the Verification Queue exists so the user
184
+ # can close pending verifications. Older entries are more likely
185
+ # ready to close (less chance of revert). Oldest-first ordering puts
186
+ # those candidates at the top so the user lands on actionable rows
187
+ # without scrolling past fresh-release entries still in dwell-time.
188
+ #
189
+ # Fixture spans ages 0d, 1d, 14d, 30d. Assert that after sort, the
190
+ # 30-day entry is at row 1 (highest "likely verified" probability)
191
+ # and the 0-day entry is at row N (lowest probability).
192
+ today="2026-05-02"
193
+ fixture_in="$TEST_TMP/fixture-vq-ages.tsv"
194
+ cat >"$fixture_in" <<'EOF'
195
+ 2026-05-02 150 P150: 0 days no
196
+ 2026-05-01 149 P149: 1 day no
197
+ 2026-04-18 048 P048: 14 days yes
198
+ 2026-04-02 030 P030: 30 days yes
199
+ EOF
200
+
201
+ sorted=$(sort -t$'\t' -k1,1 -k2,2n "$fixture_in" | cut -f3)
202
+ first=$(printf "%s\n" "$sorted" | head -1)
203
+ last=$(printf "%s\n" "$sorted" | tail -1)
204
+ [ "$first" = "P030: 30 days yes" ]
205
+ [ "$last" = "P150: 0 days no" ]
206
+ }
@@ -99,7 +99,7 @@ For each REMOVE: `Edit` with the existing row as `old_string`, and remove it (re
99
99
 
100
100
  For each ADD to WSJF Rankings: locate the correct WSJF position by descending order. Use `Edit` to insert the new row immediately above the next-lower-WSJF row (or append at the bottom of the table if the new row's WSJF is the lowest). The Edit's `old_string` is the line that the new row inserts above; the `new_string` is the new row + the same line below.
101
101
 
102
- For each ADD to Verification Queue: append at the bottom of the VQ table (the table is loosely sorted by release age, oldest first; recent releases land at the bottom).
102
+ For each ADD to Verification Queue: insert the new row in `Released date ASC` position (oldest at row 1; same-day releases tiebreak by ID ASC) per the canonical VQ sort direction. <!-- VQ-SORT-DIRECTION: oldest-first per ADR-022 --> Recent releases land at the bottom; oldest-pending verifications surface at the top so the user lands on actionable closure candidates first per P048 user-task semantics. Drift here re-opens P150.
103
103
 
104
104
  After all edits, re-run `packages/itil/scripts/reconcile-readme.sh docs/problems` to confirm exit 0. If the second run still reports drift, investigate the residual edits — do NOT re-run reconciliation in a loop, as that hides systematic edit failures.
105
105
 
@@ -72,7 +72,7 @@ After re-scoring, present three sections matching the README.md format (same ren
72
72
  |------|-----|-------|----------|--------|--------|----------|-------|
73
73
  ```
74
74
 
75
- **Verification Queue** — `.verifying.md` tickets, sorted by release age (oldest first). Highlight any ticket whose release age is **≥ 14 days** with a `yes (N days)` marker in the `Likely verified?` column (within-skill default per P048 Candidate 4 — tunable; promote to cross-skill policy if needed):
75
+ **Verification Queue** — `.verifying.md` tickets, sorted by `Released date ASC` (oldest at row 1; same-day releases tiebreak by ID ASC) per ADR-022 + P048 user-task semantics. Older entries are the most likely-verified candidates the user wants to surface first when closing the queue; newest-first ordering pushes those actionable closure candidates below the fold and contradicts the section header. <!-- VQ-SORT-DIRECTION: oldest-first per ADR-022 --> Any change to the VQ sort direction MUST update this rendering block, Step 5's README template, AND `/wr-itil:manage-problem` SKILL.md Step 5 P094 / Step 7 P062 / Step 9c / Step 9e + `/wr-itil:transition-problem` + `/wr-itil:transition-problems` + `/wr-itil:reconcile-readme` + `/wr-itil:list-problems` — drift re-opens P150. Highlight any ticket whose release age is **≥ 14 days** with a `yes (N days)` marker in the `Likely verified?` column (within-skill default per P048 Candidate 4 — tunable; promote to cross-skill policy if needed):
76
76
 
77
77
  ```
78
78
  | ID | Title | Released | Fix summary | Likely verified? |
@@ -128,7 +128,7 @@ Dev-work queue only. Verification Pending (`.verifying.md`, WSJF multiplier 0) a
128
128
 
129
129
  ## Verification Queue
130
130
 
131
- Fix released, awaiting user verification (driven off `docs/problems/*.verifying.md` via glob per ADR-022). Ranked by release age, oldest first. `Likely verified?` column marks tickets ≥14 days old (P048 Candidate 4 default).
131
+ Fix released, awaiting user verification (driven off `docs/problems/*.verifying.md` via glob per ADR-022). Sorted by `Released date ASC` (oldest at row 1; same-day releases tiebreak by ID ASC). <!-- VQ-SORT-DIRECTION: oldest-first per ADR-022 --> `Likely verified?` column marks tickets ≥14 days old (P048 Candidate 4 default).
132
132
 
133
133
  | ID | Title | Released | Likely verified? |
134
134
  |----|-------|----------|------------------|
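
A rough sketch of how the Verification Queue rows and the `Likely verified?` column could be produced from the rules above. Everything named here is hypothetical: the functions `days_from_civil` and `vq_rows`, the tab-separated input of Released date, numeric ID, and title, and the plain `no` marker for tickets under 14 days (the source only specifies the `yes (N days)` form). Date arithmetic uses a standard days-from-civil calculation in shell arithmetic to avoid depending on a particular `date` implementation; it assumes Gregorian dates in years after 0.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: render VQ rows oldest-first with a 14-day marker.

# days_from_civil YYYY-MM-DD: prints days since 1970-01-01.
days_from_civil() {
  local y m d era yoe doy doe
  IFS=- read -r y m d <<<"$1"
  # Force base 10 so "08" and "09" are not parsed as octal.
  y=$((10#$y)); m=$((10#$m)); d=$((10#$d))
  if [ "$m" -le 2 ]; then y=$((y - 1)); fi
  era=$((y / 400))
  yoe=$((y - era * 400))
  doy=$(( (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1 ))
  doe=$((yoe * 365 + yoe / 4 - yoe / 100 + doy))
  echo $((era * 146097 + doe - 719468))
}

# vq_rows TODAY: reads "Released<TAB>ID<TAB>Title" rows on stdin and
# prints markdown table rows sorted by Released date ASC, then ID ASC.
vq_rows() {
  local today_days released id title age marker
  today_days=$(days_from_civil "$1")
  LC_ALL=C sort -t $'\t' -k1,1 -k2,2n |
  while IFS=$'\t' read -r released id title; do
    age=$((today_days - $(days_from_civil "$released")))
    if [ "$age" -ge 14 ]; then
      marker="yes ($age days)"
    else
      marker="no"   # assumed wording; the source does not specify it
    fi
    printf '| P%s | %s | %s | %s |\n' "$id" "$title" "$released" "$marker"
  done
}
```

Passing the reference date as an argument, rather than reading the system clock, keeps the output reproducible in tests.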
@@ -171,6 +171,8 @@ Every Step 7 status transition regenerates `docs/problems/README.md` and stages
171
171
 
172
172
  The refresh uses the same rendering rules as `/wr-itil:review-problems` Step 9e (glob `docs/problems/*.open.md` / `*.known-error.md` / `*.verifying.md` / `*.parked.md`; rank open/known-error by WSJF; list verifyings in the Verification Queue ordered by release age; list parkeds in the Parked section) but skips the full re-scoring pass — existing WSJF values on the ticket files are trusted. The refresh is a render, not a re-rank.
173
173
 
174
+ **Verification Queue sort direction (P150)**: Verification Queue rows are sorted by `Released date ASC` (oldest at row 1; same-day releases tiebreak by ID ASC) per ADR-022 + P048 user-task semantics — older entries are the most likely-verified candidates the user wants to surface first when closing the queue. <!-- VQ-SORT-DIRECTION: oldest-first per ADR-022 --> Drift here re-opens P150.
175
+
174
176
  **Mechanism:**
175
177
 
176
178
  1. After renaming + Editing + `git add`-ing the transitioned ticket file (per the staging-trap rule above), regenerate `docs/problems/README.md` in-place reflecting the new filename set and the transitioned ticket's new Status.
@@ -178,7 +178,7 @@ After the per-pair loop finishes, IF AT LEAST ONE PAIR SUCCEEDED:
178
178
 
179
179
  Per P062, every Step 7 status transition refreshes README.md. At the batch grain, the refresh runs ONCE — a single render reflecting ALL surviving renames + Status updates. Not N refreshes (that would force the README to thrash N times mid-batch and amplify diff noise).
180
180
 
181
- The refresh follows the same render rules as `/wr-itil:review-problems` Step 9e (glob `docs/problems/*.open.md` / `*.known-error.md` / `*.verifying.md` / `*.parked.md`; rank open + known-error by WSJF; Verification Queue ordered by release age; Parked section). It does NOT re-rank — existing WSJF values on ticket files are trusted; the refresh is a render, not a re-rank.
181
+ The refresh follows the same render rules as `/wr-itil:review-problems` Step 9e (glob `docs/problems/*.open.md` / `*.known-error.md` / `*.verifying.md` / `*.parked.md`; rank open + known-error by WSJF; Verification Queue sorted by `Released date ASC` with same-day tiebreak by ID ASC per ADR-022 + P048; Parked section). It does NOT re-rank — existing WSJF values on ticket files are trusted; the refresh is a render, not a re-rank. <!-- VQ-SORT-DIRECTION: oldest-first per ADR-022 --> Drift on the VQ sort direction re-opens P150.
182
182
 
183
183
  ```bash
184
184
  git add docs/problems/README.md
@@ -93,9 +93,33 @@ The `wr-itil-reconcile-readme` command is a `$PATH`-resolved shim shipped in `pa
93
93
 
94
94
  Exit-code routing:
95
95
  - **Exit 0 (clean)**: continue to Step 1.
96
- - **Exit 1 (drift detected)**: structured diff lines printed to stdout, one per drift entry (≤150 bytes per ADR-038 progressive-disclosure budget). Per ADR-013 Rule 6 (non-interactive AFK fail-safe), invoke `/wr-itil:reconcile-readme` to apply the corrections + commit a `chore(problems): reconcile README ...` commit, then proceed to Step 1. The reconciled README is the orchestrator's source of truth for Step 3 ranking; a stale read at Step 1 would propagate the lie into the iteration's selection.
96
+ - **Exit 1 (drift detected)**: structured diff lines printed to stdout, one per drift entry (≤150 bytes per ADR-038 progressive-disclosure budget). Capture stdout to a temp file and classify the drift via the **uncommitted-rename carve-out** (P149) before halt-routing; see "Drift classification carve-out" immediately below.
97
97
  - **Exit 2 (parse error)**: README missing or malformed. Halt the loop with the parse-error message and the structured Prior-Session State report — this is a deeper repair that needs investigation, not mechanical reconciliation.
98
98
 
99
+ ##### Drift classification carve-out (P149)
100
+
101
+ The Exit 1 auto-route to `/wr-itil:reconcile-readme` is correct for **committed cross-session drift** but **wrong for uncommitted-rename-rooted drift** — when a prior AFK iter (or any in-flight session) carries a staged ticket rename that the next iteration's in-flow P094 / P062 refresh will reconcile in the upcoming commit per ADR-014's single-commit grain. Auto-routing in the latter case fires an extra `chore(problems): reconcile README ...` commit and splits one logical change across two commits, violating the grain. Worse for the AFK orchestrator: that extra commit lands BEFORE the iter's actual work commit, so the audit trail reads "reconcile, then ticket work" when the truth is "ticket work in progress, README refresh deferred to its in-flow contract".
102
+
103
+ Run the classifier on Exit 1 to distinguish the two cases:
104
+
105
+ ```bash
106
+ wr-itil-reconcile-readme docs/problems > /tmp/wr-itil-drift-$$.txt
107
+ reconcile_exit=$?
108
+ if [ "$reconcile_exit" -eq 1 ]; then
109
+ wr-itil-classify-readme-drift /tmp/wr-itil-drift-$$.txt docs/problems
110
+ classify_exit=$?
111
+ rm -f /tmp/wr-itil-drift-$$.txt
112
+ fi
113
+ ```
114
+
115
+ The `wr-itil-classify-readme-drift` command is a `$PATH`-resolved shim (ADR-049 naming grammar) dispatching `packages/itil/scripts/classify-readme-drift.sh`. It cross-references drifting IDs from the script's stdout against `git status --porcelain docs/problems/` filtered for staged rename (`R`) entries.
116
+
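The staged-rename half of that cross-reference can be sketched as follows. This is a minimal illustration, not the contents of `classify-readme-drift.sh`: the helper name `staged_rename_destinations` is hypothetical, and it assumes the v1 porcelain rename shape `R  old -> new` with unquoted paths.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the package): reads `git status --porcelain`
# (v1) output on stdin and prints the destination path of every staged rename.
# Assumes unquoted paths; git quotes paths that contain special characters.
staged_rename_destinations() {
  # Staged renames carry "R" in the first status column and are rendered as
  # "R  <old path> -> <new path>"; keep only the part after " -> ".
  sed -n 's/^R. .* -> //p'
}
```

A caller would feed it live state with `git status --porcelain docs/problems/ | staged_rename_destinations` and compare the resulting paths against the drifting IDs.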
117
+ Classifier exit-code routing:
118
+
119
+ - **`classify_exit == 0` (INLINE_REFRESH)**: every drifting ID is the destination of a staged rename in the working tree. Log a one-line note in the iter summary ("Step 0 reconcile drift covered by N staged rename(s); deferring README refresh to in-flow Step 5 / Step 7 per P094 / P062 + ADR-014 single-commit grain") and continue to Step 1. Do NOT invoke `/wr-itil:reconcile-readme` — the in-flow refresh will land the README correction in the same commit as the iter's ticket work.
120
+ - **`classify_exit == 1` (HALT_ROUTE_RECONCILE)**: at least one drifting ID is NOT covered by a staged rename — committed cross-session drift OR mixed. Per ADR-013 Rule 6 (non-interactive AFK fail-safe), invoke `/wr-itil:reconcile-readme` to apply the corrections + commit a `chore(problems): reconcile README ...` commit, then proceed to Step 1. The reconciled README is the orchestrator's source of truth for Step 3 ranking — a stale read at Step 1 would propagate the lie into the iteration's selection. Mixed routes to halt because `/wr-itil:reconcile-readme` resolves both classes safely; the in-flow refresh only handles the rename'd subset.
121
+ - **`classify_exit == 2` (parse error)**: classifier received empty / missing drift input — contract violation upstream. Fall back to the conservative auto-route.
122
+
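The three-way routing above can be sketched as a single `case` dispatch. The function name `route_classify_exit` and the `FALLBACK_RECONCILE` label are illustrative assumptions; only `INLINE_REFRESH` and `HALT_ROUTE_RECONCILE` come from the prose. Sending unlisted exit codes to the conservative route is also an assumption, since only 0, 1, and 2 are specified.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the package): maps the classifier's exit
# code to the routing decision described in the list above.
route_classify_exit() {
  case "$1" in
    0) echo "INLINE_REFRESH" ;;        # all drift covered by staged renames: continue to Step 1
    1) echo "HALT_ROUTE_RECONCILE" ;;  # committed or mixed drift: invoke /wr-itil:reconcile-readme
    *) echo "FALLBACK_RECONCILE" ;;    # exit 2 (and, by assumption, anything else): conservative auto-route
  esac
}
```

For example, `route_classify_exit "$classify_exit"` after the snippet above would yield the label the orchestrator acts on.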
99
123
  This is a robustness layer ON TOP of P094 + P062, not a supersession — both per-operation contracts remain in force inside each iteration's manage-problem / transition-problem invocation.
100
124
 
101
125
  ### Step 1: Scan the backlog
@@ -266,6 +290,8 @@ rm -f "$ITER_JSON"
266
290
 
267
291
  **Idle-timeout SIGTERM (P121).** The poll loop above is the orchestrator-side guard against stuck iteration subprocesses — iters that complete their semantic work (commits land, retro runs, `ITERATION_SUMMARY` is emitted into the agent output stream) but then sit waiting on a hook timeout, a backgrounded subagent that never resolved, or some other CLI-level idle behaviour before exiting. Without the guard the orchestrator polls indefinitely; the JSON file stays 0 bytes (the CLI only flushes on exit) and wall-clock burns for ~$8/hour of subprocess overhead with no API turns. The 2026-04-25 P118 iter 5 evidence: 121 min wall-clock; final commit at ~100 min; manual SIGTERM at 121 min produced a clean 5649-byte JSON response with `is_error: false`, full `## Session Retrospective` section, parseable `ITERATION_SUMMARY` block, and `duration_ms: 2992935` (49.9 min — the real-work portion). SIGTERM is therefore a safe recovery primitive for this stuck-state class — empirically a clean exit-flush, not a destructive interrupt. Behavioural confirmation lives in `test/work-problems-step-5-idle-timeout-sigterm.bats` (P121 ships with this fixture as the second-source the production observation needed). The default `IDLE_TIMEOUT_S=3600` (60 min) leaves headroom for genuinely long architectural iters; the `WORK_PROBLEMS_IDLE_TIMEOUT_S` env-var overrides per-environment for adopters who run very long iters or want a tighter guard. The orchestrator's Step 6 progress line SHOULD annotate `(SIGTERM_SENT)` when the branch fires so the user can distinguish a SIGTERM-recovered iter from a normal completion (per JTBD-006 audit-trail expectation).
268
292
 
293
+ **SIGTERM exit-flush is conditional, not universal (P147).** The "clean exit-flush" claim above is empirically true ONLY when the subprocess has already emitted `ITERATION_SUMMARY` through the agent stream before going idle (the P118 shape: semantic work complete + retro complete, then idle-wait on some final hook). The 2026-04-29 P146 incident falsified the universal generalisation: an iteration deadlocked in a `bash until`-loop polling a backgrounded-task output file (commits had landed; ITERATION_SUMMARY had NEVER been emitted) and SIGTERM at 68m34s produced exit 143 with a **0-byte JSON file**. `claude -p --output-format json` writes the entire response as a single blob ON normal exit; the SIGTERM-handler (whatever it does inside the CLI) cannot synthesise a JSON response that the agent loop never produced. **Stuck-before-emit subclass: SIGTERM still recovers wall-clock, but loses metadata.** When the orchestrator observes exit 143 + 0-byte JSON, it MUST treat the iteration as a metadata-loss event: (1) verify work integrity from independent evidence (`git log` for commits + `git status --porcelain` for tree state); (2) halt the AFK loop per exit-code semantics rather than silently continue; (3) reconstruct cost from the Anthropic billing dashboard rather than from the missing JSON envelope. The behavioural second-source for the stuck-before-emit case lives in the same `test/work-problems-step-5-idle-timeout-sigterm.bats` fixture (a fake-shim that traps SIGTERM and exits without writing stdout, asserting `JSON_BYTES=0` after the orchestrator-shape harness fires SIGTERM). Cost-of-metadata-loss < cost-of-stuck-subprocess; SIGTERM remains the right recovery primitive — the conditional caveat is about what flushes after, not whether to fire.
294
+
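The exit 143 + 0-byte JSON check described above might look like the sketch below. The helper name `classify_iter_exit` and its output labels are hypothetical; the follow-up actions (git evidence, halting the loop, billing-dashboard reconstruction) are left to the caller.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the package): classifies a finished
# iteration from its exit code and the size of its JSON capture file.
classify_iter_exit() {
  local exit_code="$1" json_file="$2" json_bytes=0
  if [ -f "$json_file" ]; then
    # tr strips the leading padding some wc implementations emit.
    json_bytes=$(wc -c < "$json_file" | tr -d ' ')
  fi
  if [ "$exit_code" -eq 143 ] && [ "$json_bytes" -eq 0 ]; then
    echo "METADATA_LOSS"      # stuck-before-emit: SIGTERM fired, nothing was flushed
  elif [ "$json_bytes" -gt 0 ]; then
    echo "JSON_AVAILABLE"     # a response blob exists and can be parsed
  else
    echo "UNKNOWN"            # empty capture without exit 143: needs investigation
  fi
}
```

On `METADATA_LOSS` the orchestrator would then consult `git log` and `git status --porcelain` before deciding how to report the iteration.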
269
295
  **LAST_ACTIVITY_MARK signal trade-off.** The mark is `max(DISPATCH_START_EPOCH, last commit timestamp)`. The dispatch-start floor is intentional: skip-iterations that produce no commit (Step 4 routes a ticket to `action: skipped`) are bounded by `IDLE_TIMEOUT_S` since dispatch start, not by an arbitrarily-stale prior-commit timestamp. This protects against false-positive SIGTERM at iter T=0 when the most recent commit happens to be hours old. The trade-off is the inverse: a skip-iter that runs for `IDLE_TIMEOUT_S` (60 min default) will SIGTERM even though it never had a chance to commit. The 60-min default is well past the typical skip-iter wall-clock (a normal skip completes in seconds), so the trade-off rarely fires in practice; adopters who run unusually long skip-evaluation iters (e.g. deep architect-design probes) should raise `WORK_PROBLEMS_IDLE_TIMEOUT_S` accordingly. Alternative signals considered and rejected: `stat -f%m "$ITER_JSON"` (binary — file mtime only changes on subprocess exit, useless during the idle gap); subprocess RSS-change tracking (noisy; spikes during Agent-tool expansions confound the signal). The git-log signal is the cheapest reliable progress indicator the orchestrator already has.
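The mark computation and the timeout comparison can be sketched as two small pure functions. The names `last_activity_mark` and `should_sigterm` are illustrative assumptions, not identifiers from the skill; the 3600-second default mirrors the documented `IDLE_TIMEOUT_S`.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not part of the package) illustrating the
# LAST_ACTIVITY_MARK arithmetic described above.

# Prints max(dispatch start epoch, last commit epoch).
last_activity_mark() {
  local dispatch_start="$1" last_commit="$2"
  if [ "$last_commit" -gt "$dispatch_start" ]; then
    echo "$last_commit"
  else
    echo "$dispatch_start"   # dispatch-start floor: stale commits never count as activity
  fi
}

# Succeeds (exit 0) when the idle gap strictly exceeds the timeout.
should_sigterm() {
  local now="$1" mark="$2" timeout="${3:-3600}"
  [ $(( now - mark )) -gt "$timeout" ]
}
```

In a live loop the second argument to `last_activity_mark` would come from `git log -1 --format=%at HEAD` and `now` from `date +%s`.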
270
296
 
271
297
  **Iteration prompt body (self-contained — the subprocess has no prior conversation context):**
@@ -635,6 +661,7 @@ When every skipped ticket is in the `upstream-blocked` category (stop-condition
635
661
  ## Related
636
662
 
637
663
  - **P121** (`docs/problems/121-afk-orchestrator-should-sigterm-stuck-subprocesses-after-idle-timeout.verifying.md`) — driver for Step 5's backgrounded-poll-loop dispatch shape (replacing the prior foreground-synchronous form) and the idle-timeout SIGTERM branch. The 2026-04-25 P118 iter 5 evidence: an iteration subprocess sat idle ~70 min after its final commit, then SIGTERM produced a clean JSON exit-flush. Fix: orchestrator backgrounds the subprocess, polls every 60s, computes `LAST_ACTIVITY_MARK = max(DISPATCH_START_EPOCH, git log -1 --format=%at HEAD)`, and sends SIGTERM when `now - LAST_ACTIVITY_MARK > WORK_PROBLEMS_IDLE_TIMEOUT_S` (default 3600s = 60 min). Behavioural second-source: `test/work-problems-step-5-idle-timeout-sigterm.bats` exercises a fake `claude -p` shim that sleeps past the threshold and asserts SIGTERM, JSON exit-flush, env-var override, and within-threshold no-fire. Step 6's per-iter progress line SHOULD annotate `(SIGTERM_SENT)` when the branch fires so users can distinguish recovered iters from natural completions. ADR-032's subprocess-boundary variant amended 2026-04-26 with the backgrounded-poll-loop refinement.
664
+ - **P147** (`docs/problems/147-p121-sigterm-clean-flush-guarantee-conditional-needs-skill-md-caveat-for-stuck-before-emit-subclass.verifying.md`) — refinement to P121's "clean exit-flush" claim. P118's evidence held only for subprocesses that had already emitted `ITERATION_SUMMARY` before going idle; the 2026-04-29 P146 incident produced exit 143 + 0-byte JSON when SIGTERM fired before `ITERATION_SUMMARY` emission. Fix: SKILL.md prose now carries the conditional caveat (Step 5 "SIGTERM exit-flush is conditional, not universal" subsection) and adopters reading the prose are directed to treat exit 143 + 0-byte JSON as a metadata-loss event — verify work integrity from `git log` + `git status --porcelain`, halt the AFK loop, and reconstruct cost from the Anthropic billing dashboard. Behavioural second-source extends `test/work-problems-step-5-idle-timeout-sigterm.bats` with a stuck-before-emit fake-shim asserting `JSON_BYTES=0` after SIGTERM. Mechanism unchanged (SIGTERM remains the right recovery primitive); the refinement is documentation accuracy + the metadata-loss-event handling shape.
638
665
  - **P089** (`docs/problems/089-work-problems-step-5-dispatch-robustness-stdin-warning-and-cost-metadata-edge-case.verifying.md`) — driver for Step 5's `< /dev/null` dispatch redirect and the Per-iteration cost metadata "Authority hierarchy" paragraph. Gap 1: stdin warning contaminated stderr-merged JSON captures; closed by adding `< /dev/null` to the canonical dispatch command. Gap 2: `.usage.*` undercounts when subprocess exits via a background-task completion ack while `.total_cost_usd` stays cumulative-authoritative; closed by documenting the authority hierarchy in Step 5 and the Session Cost output section so adopters trust cost and label token totals best-effort.
639
666
  - **P086** (`docs/problems/086-afk-iteration-subprocess-does-not-run-retro-before-returning.verifying.md`) — driver for Step 5's retro-on-exit clause. Iteration subprocesses exit without running retro, so per-iteration friction (hook misbehaviour, repeat-workaround patterns, pipeline instability) evaporates on exit. Fix: iteration prompt body names `/wr-retrospective:run-retro` as a closing step before `ITERATION_SUMMARY` emission; retro runs inside the subprocess so Step 2b pipeline-instability scan has the full tool-call history; run-retro commits its own work per ADR-014; orchestrator picks up retro-created tickets on the next Step 1 scan.
640
667
  - **P084** (`docs/problems/084-work-problems-iteration-worker-has-no-agent-tool-so-architect-jtbd-gates-block.open.md`) — driver for Step 5's subprocess-boundary dispatch. Supersedes P077's Agent-tool dispatch on the same Step 5 surface because Agent-tool-spawned subagents cannot themselves invoke Agent (platform restriction), which prevents governance gate markers from being set inside the iteration worker.
@@ -194,3 +194,107 @@ assert "total_cost_usd" in j, "cost metadata must survive SIGTERM exit-flush"
194
194
  run grep -nE 'ITER_PID=\$!|& *\n*ITER_PID|claude -p.{0,200}&[[:space:]]*$' "$SKILL_FILE"
195
195
  [ "$status" -eq 0 ]
196
196
  }
197
+
198
+ # ---------------------------------------------------------------------------
199
+ # P147 stuck-before-emit subclass: P121's "SIGTERM produces clean JSON exit-
200
+ # flush" claim was empirically grounded only against subprocesses that had
201
+ # ALREADY emitted ITERATION_SUMMARY before going idle (the P118 evidence). The
202
+ # 2026-04-29 P146 incident falsified the generalisation: a subprocess that
203
+ # deadlocked BEFORE ITERATION_SUMMARY emission produced exit 143 + a 0-byte
204
+ # JSON file when SIGTERMed. `claude -p --output-format json` writes the entire
205
+ # response as a single blob ON normal exit; SIGTERM-before-blob-write means no
206
+ # JSON is ever written.
207
+ #
208
+ # The fixture below exercises the stuck-before-emit shape with a fake `claude`
209
+ # that traps SIGTERM and exits WITHOUT writing any stdout. The orchestrator-
210
+ # shape harness then SIGTERMs after the idle threshold, and the assertions
211
+ # pin: (a) the JSON file is 0 bytes (the metadata-loss-event indicator the
212
+ # SKILL.md prose now warns adopters to watch for), and (b) SIGTERM was sent
213
+ # (the recovery primitive still fires — the bug is in the prose claim about
214
+ # what flushes, not in the SIGTERM action itself).
215
+ #
216
+ # @problem P147
217
+
218
+ dispatch_with_poll_no_emit() {
219
+ local json_file="${TEST_TMP}/iter.json"
220
+ local idle_timeout_s="${WORK_PROBLEMS_IDLE_TIMEOUT_S:-3600}"
221
+ local dispatch_start_epoch
222
+ dispatch_start_epoch=$(date +%s)
223
+ local sigterm_sent=0
224
+
225
+ : > "$json_file"
226
+ claude_no_emit -p --permission-mode bypassPermissions --output-format json "TEST" \
227
+ < /dev/null > "$json_file" 2>&1 &
228
+ local iter_pid=$!
229
+
230
+ while kill -0 "$iter_pid" 2>/dev/null; do
231
+ sleep 1
232
+ local now
233
+ now=$(date +%s)
234
+ local last_activity_mark=$dispatch_start_epoch
235
+ local idle_seconds=$(( now - last_activity_mark ))
236
+ if (( idle_seconds > idle_timeout_s )) && (( sigterm_sent == 0 )); then
237
+ kill -TERM "$iter_pid" 2>/dev/null || true
238
+ sigterm_sent=1
239
+ fi
240
+ done
241
+
242
+ wait "$iter_pid" 2>/dev/null || true
243
+
244
+ local json_bytes
245
+ json_bytes=$(wc -c < "$json_file" | tr -d ' ')
246
+
247
+ printf 'SIGTERM_SENT=%d\n' "$sigterm_sent"
248
+ printf 'JSON_BYTES=%s\n' "$json_bytes"
249
+ }
250
+
251
+ setup_no_emit_shim() {
252
+ cat > "$FAKE_BIN/claude_no_emit" <<'FAKE_EOF'
253
+ #!/usr/bin/env bash
254
+ # Test fake for work-problems Step 5 P147 stuck-before-emit fixture.
255
+ # Traps SIGTERM and exits 0 WITHOUT emitting any stdout. Mirrors the 2026-
256
+ # 04-29 P146 incident shape: subprocess deadlocked before ITERATION_SUMMARY
257
+ # emission; SIGTERM cannot flush a JSON blob the CLI never produced.
258
+ trap 'exit 0' TERM
259
+ sleep "${FAKE_SLEEP_AFTER:-30}"
260
+ FAKE_EOF
261
+ chmod +x "$FAKE_BIN/claude_no_emit"
262
+ }
263
+
264
+ @test "P147: SIGTERM-before-emit produces 0-byte JSON (stuck-before-emit subclass)" {
265
+ setup_no_emit_shim
266
+ export FAKE_SLEEP_AFTER=10
267
+ export WORK_PROBLEMS_IDLE_TIMEOUT_S=2
268
+ run dispatch_with_poll_no_emit
269
+ [ "$status" -eq 0 ]
270
+ [[ "$output" == *"SIGTERM_SENT=1"* ]]
271
+ [[ "$output" == *"JSON_BYTES=0"* ]]
272
+ }
273
+
274
+ @test "P147: SKILL.md Step 5 names the conditional caveat for SIGTERM exit-flush" {
275
+ # The prose claim must NOT generalise the P118 clean-flush observation
276
+ # universally; it must explicitly condition on ITERATION_SUMMARY having been
277
+ # emitted before SIGTERM. Adopters reading the SKILL.md need to know that
278
+ # SIGTERM-before-emit produces a 0-byte JSON, not a clean flush. Require
279
+ # the failure-mode language explicitly so a future drift that removes the
280
+ # caveat but happens to mention ITERATION_SUMMARY in a different sentence
281
+ # cannot keep this assertion green.
282
+ run grep -niE "stuck.?before.?emit|before ITERATION_SUMMARY|ITERATION_SUMMARY.{0,80}not.{0,40}(yet|been).{0,40}emit|conditional.{0,40}(caveat|on)" "$SKILL_FILE"
283
+ [ "$status" -eq 0 ]
284
+ }
285
+
286
+ @test "P147: SKILL.md Step 5 documents metadata-loss-event handling (git-evidence + halt + billing-dashboard)" {
287
+ # When SIGTERM-before-emit is observed (exit 143 + 0-byte JSON), the
288
+ # orchestrator must verify work integrity from independent evidence (git
289
+ # log + git status), halt the AFK loop per exit-code semantics, and
290
+ # reconstruct cost from the Anthropic billing dashboard. This guards
291
+ # against orchestrators silently treating exit-143-no-JSON as a normal
292
+ # iteration completion.
293
+ run grep -niE "metadata.?loss|git log.{0,80}git status|billing dashboard|reconstruct cost" "$SKILL_FILE"
294
+ [ "$status" -eq 0 ]
295
+ }
296
+
297
+ @test "P147: SKILL.md Step 5 cites P147 (conditional-caveat ticket)" {
298
+ run grep -nE "P147" "$SKILL_FILE"
299
+ [ "$status" -eq 0 ]
300
+ }