@windyroad/retrospective 0.4.0-preview.138 → 0.5.0-preview.142

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,5 +1,5 @@
1
1
  {
2
2
  "name": "wr-retrospective",
3
- "version": "0.4.0",
3
+ "version": "0.5.0",
4
4
  "description": "Session retrospective reminders and plan review for Claude Code"
5
5
  }
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@windyroad/retrospective",
3
- "version": "0.4.0-preview.138",
3
+ "version": "0.5.0-preview.142",
4
4
  "description": "Session retrospectives that update briefings and create problem tickets",
5
5
  "bin": {
6
6
  "windyroad-retrospective": "./bin/install.mjs"
@@ -43,14 +43,15 @@ For each codification candidate, also identify the **Kind** (`create` for a new
43
43
  - **ADR** — architectural decision worth recording. Route to `/wr-architect:create-adr`.
44
44
  - **JTBD** — job-to-be-done record for a persona. Route to `/wr-jtbd:update-guide`.
45
45
  - **Guide** — voice, style, or risk policy edit. Route to `/wr-voice-tone:update-guide`, `/wr-style-guide:update-guide`, or `/wr-risk-scorer:update-policy`.
46
- - **Problem ticket** — diagnostic, project-specific friction (the default for flaws). Route to `/wr-itil:manage-problem`.
47
46
  - **Test fixture** — regression test for a recurring failure pattern (bats fixture, unit test). Best fit when the observation is "this kept breaking the same way".
48
47
  - **Memory** — per-user or per-project memory note in `~/.claude/.../memory/`. Best fit for short, user-habit observations that aren't a codifiable sequence (e.g. "I always forget to run `npm run verify` before pushing").
49
48
 
49
+ **Note (P075)**: the shape list enumerates **codification outputs** — not ticketing. Every codifiable observation becomes a problem ticket in Step 4b Stage 1 regardless of shape. The shape choice is recorded as the ticket's proposed fix strategy (Stage 2), not as an alternative to ticketing. The legacy `Problem ticket` shape row has been removed; it represented a foregone decision (every observation is ticket-worthy) that is now mechanical in Stage 1.
50
+
50
51
  If no shape fits — the observation is a one-off learning, not a repeating pattern — it belongs in BRIEFING.md (Step 3), not Step 4b.
51
52
 
52
53
  Counter-examples (what does **not** become a codification candidate):
53
- - "The commit gate rejected my work twice because X was misconfigured" — diagnostic, project-specific **problem ticket** shape (route via Step 4b).
54
+ - "The commit gate rejected my work twice because X was misconfigured" — diagnostic, project-specific. Still flows through Step 4b Stage 1 ticketing; the fix strategy (Stage 2) is captured as free-text under `Other codification shape` (e.g. hook tweak, script adjustment).
54
55
  - "I always forget to run `npm run verify` before pushing" — short, user-habit rather than codifiable sequence → **memory** shape or **BRIEFING.md** note.
55
56
 
56
57
  ### 3. Update BRIEFING.md
@@ -118,69 +119,79 @@ Problems whose fix shipped but whose closure is still pending (`docs/problems/*.
118
119
  - **manage-problem Step 9d** (baseline user-initiated verification review per P048) still fires on `/wr-itil:manage-problem review` — it is the age-based heuristic path. Step 4a here is the evidence-based session-wrap path. The two compose: a ticket that is both "≥ 14 days old" (Step 9d highlight) AND "exercised successfully this session" (Step 4a candidate) should be surfaced in both paths independently; closing via either path moves the ticket to `.closed.md` and de-lists it from both queues.
119
120
  - **Skipped in this step**: `.verifying.md` tickets for fixes that ship in the currently-running session (e.g. P066, P063 just transitioned to `.verifying.md` this session) — a session cannot verify its own fix beyond "bats passed at commit time"; subsequent-session exercise is the meaningful signal. Treat same-session verifyings as "not exercised in-session" for closure purposes unless a later-session exercise path is in the citation list.
120
121
 
121
- ### 4b. Recommend new codifications
122
+ ### 4b. Two-stage codification — ticket first, fix strategy second (P075)
123
+
124
+ Every codification candidate identified in Step 2 flows through a **two-stage flow**. Stage 1 is mechanical — every candidate becomes a problem ticket; ticketing is not a user decision. Stage 2 is a per-ticket `AskUserQuestion` recording the **proposed fix strategy** as the codification shape.
125
+
126
+ **User rationale (P075)**: the legacy 19-option flat list presented a ticket-this-or-pick-another-shape choice as one option among many, but in practice the ticketing axis has a foregone answer — every codify-worthy observation is also problem-worthy. Re-asking the ticketing question is redundant. Flipping the flow collapses the redundant decision: ticket first (mechanical), fix strategy second (user-interactive).
127
+
128
+ **Skill candidate / Codification candidate backward compatibility**: the legacy `Skill candidate` and `Codification candidate` AskUserQuestion headers are superseded by Stage 2's `Proposed fix` header. The P044 / P050 / P051 enforcement intents are preserved — they now ride in Stage 2 Options 1–3 on a per-ticket basis rather than as one option among many for a single batch prompt.
129
+
130
+ #### Stage 1: Ticket every codify-worthy observation (mechanical — no user decision)
131
+
132
+ For every codifiable observation identified in Step 2:
133
+
134
+ 1. **Apply P016 concern-boundary analysis**: if the observation covers multiple independent concerns, split into N observations before ticketing. One ticket per concern.
135
+ 2. **Invoke `/wr-itil:manage-problem`** via the Skill tool to create a problem ticket. The observation text becomes the ticket Description; the retro narrative populates the Root Cause Analysis; the `## Related` section cites this retro run. (Once the ADR-032 `capture-*` background sibling ships for manage-problem, Stage 1 can delegate to `/wr-itil:capture-problem` instead so ticketing runs out of the foreground turn; same contract, different invocation mode.)
136
+
137
+ **ADR-032 note**: Stage 1 is a legitimate **foreground-spawns-N-background fanout** pattern — run-retro's foreground context spawns one background capture invocation per observation (when the background sibling exists). ADR-032's Confirmation section must carry this case; cite `ADR-032` (`docs/decisions/032-governance-skill-invocation-patterns.proposed.md`) explicitly when the background path lands.
138
+
139
+ **Ownership boundary** (same as Step 4a): run-retro surfaces the observation and delegates ticket creation to `/wr-itil:manage-problem`. The delegated skill renames, edits, and commits per ADR-014. run-retro does not commit its own work.
140
+
141
+ **Non-interactive / AFK branch**: Stage 1 fires regardless — ticketing is mechanical and does not require user input. If the delegated skill itself is unavailable (e.g. the Skill tool is gated out of the current context), record the observation in the retro summary's "Tickets Deferred" section so the user can ticket on return. Do NOT skip recording the observation.
122
142
 
123
- For each **codification candidate** identified in Step 2, route the decision through a single `AskUserQuestion` call. This is the ADR-013 Rule 1 structured-interaction pattern do not present the choices as prose enumeration in the skill output. The shape and Kind identified in Step 2 determine which option rows the user picks from; every shape and Kind routes through the same `AskUserQuestion` so the decision stays one structured interaction (architect decision: flat shape-prefixed options, not a two-step type-then-action or Kind-then-shape flow).
143
+ #### Stage 2: Record proposed fix strategy on each ticket (user-interactiveper ticket)
124
144
 
125
- For each candidate, invoke `AskUserQuestion` with:
126
- - `header: "Codification candidate"`
145
+ For each ticket created in Stage 1, invoke `AskUserQuestion` to record the proposed codification shape as the fix strategy. This is a per-ticket interaction — the fix-shape judgement is ticket-specific, not a single batch decision.
146
+
147
+ For each ticket:
148
+ - `header: "Proposed fix"`
127
149
  - `multiSelect: false`
128
- - Options (a flat list; each option names the shape and Kind up front so the decision is auditable):
129
-
130
- **Creation axis (Kind: create)** new outputs:
131
- 1. `Skill create stub` — description: "Record a stub candidate (suggested name, scope, triggers, prior uses) for a future scaffolding flow. Skill scaffolding itself is out of scope for this retrospective."
132
- 2. `Agentcreate stub` — description: "Record a stub candidate for a new agent (suggested name, scope, trigger conditions, delegating skill). Place under `packages/<plugin>/agents/` when scaffolded."
133
- 3. `Hook — create stub` — description: "Record a stub candidate for a new hook (event: PreToolUse / PostToolUse / UserPromptSubmit; trigger; action summary)."
134
- 4. `Settings propose entry` description: "Record a proposed `.claude/settings.json` entry (allowlist / env / hook wiring) for later review."
135
- 5. `Script — create stub` — description: "Record a stub `scripts/*.sh` or `scripts/*.mjs` candidate (shebang + TODO + scope)."
136
- 6. `CI propose step` — description: "Record a proposed `.github/workflows/ci.yml` insertion."
137
- 7. `ADR — invoke create-adr` — description: "Delegate to `/wr-architect:create-adr` so the decision is captured with proper MADR structure. Routing skill, not a stub."
138
- 8. `JTBD invoke update-guide` — description: "Delegate to `/wr-jtbd:update-guide` to add or amend a job-to-be-done record. Routing skill, not a stub."
139
- 9. `Guide — invoke update-guide / update-policy` — description: "Delegate to `/wr-voice-tone:update-guide`, `/wr-style-guide:update-guide`, or `/wr-risk-scorer:update-policy` depending on the guide touched."
140
- 10. `Problem invoke manage-problem` — description: "Delegate to `/wr-itil:manage-problem` so the candidate is WSJF-ranked against other backlog items. Routing skill, not a stub."
141
- 11. `Test fixture — create stub` — description: "Record a candidate bats / unit-test fixture for the recurring failure pattern."
142
- 12. `Memory — propose note` — description: "Record a proposed memory note (per-user or per-project) for a short user-habit observation that isn't a codifiable sequence."
143
-
144
- **Improvement axis (Kind: improve)** — targeted edits to existing outputs (P051):
145
- 13. `Skill — improvement stub` — description: "Record a proposed targeted edit to an existing skill's SKILL.md (file path, observed flaw, evidence, edit summary). Use when an existing skill has a bounded, reproducible gap."
146
- 14. `Agent — improvement stub` — description: "Record a proposed targeted edit to an existing agent file (path, observed flaw, edit summary)."
147
- 15. `Hook — improvement stub` — description: "Record a proposed targeted edit to an existing hook script or `.claude/settings.json` wiring."
148
- 16. `ADR — supersede or amend` — description: "Delegate to `/wr-architect:create-adr` with a `supersedes ADR-N` hint so the new ADR explicitly replaces or amends the outdated one. Routing skill, not a stub."
149
- 17. `Guide — improvement edit` — description: "Delegate to `/wr-voice-tone:update-guide`, `/wr-style-guide:update-guide`, `/wr-jtbd:update-guide`, or `/wr-risk-scorer:update-policy` for a targeted edit to an existing guide (voice / style / JTBD / risk policy)."
150
- 18. `Problem — edit existing ticket` — description: "Delegate to `/wr-itil:manage-problem <NNN>` update flow to amend an existing open or known-error ticket with new observations from this session."
151
-
152
- **Default:**
153
- 19. `Skip — not codify-worthy` — description: "Neither stub nor route. The observation is too small, too ambiguous, or a one-off learning that belongs in BRIEFING.md."
154
-
155
- If a single output has accumulated ≥ 3 improvement candidates in one session, prefer offering a single coordinating ticket (`Problem — invoke manage-problem` with an "apply N improvements to X" scope) over recording N separate improvement stubs — this reduces ticket churn and keeps the affected output's improvement queue coherent.
156
-
157
- If an improvement candidate touches multiple unrelated concerns, apply the P016 / P017 concern-boundary split before routing: re-run the `AskUserQuestion` once per concern, each with its own shape + Kind selection. This mirrors the concern-boundary analysis used when creating new problem tickets.
158
-
159
- If the option count is impractical for a single `AskUserQuestion` payload in a given Claude Code version, fall back to a two-question flow: (1) `"Which shape fits?"` with the shape list, (2) `"Create, improve, or skip?"` with `Create stub / Improvement stub / Invoke dedicated skill / Skip` — but prefer the single call when the surface allows it.
160
-
161
- When the user chooses any of the **Create stub** shapes (skill / agent / hook / settings / script / CI / test / memory), record a candidate entry in the Step 5 summary under "Codification Candidates" with:
150
+ - Options (exactly four top-level per ADR-013 Rule 1 cap; architect Q4 lean (b): free-text capture for multi-shape cases, not cascading AskUserQuestion batches — cascading fan-outs are the P061 anti-pattern):
151
+ 1. `Skill — create stub` — description: "Record a stub for a new skill (suggested name, scope, triggers, prior uses) on the ticket's `## Fix Strategy` section. Skill scaffolding itself remains out of scope for the retrospective."
152
+ 2. `Skill — improvement stub` — description: "Record a targeted edit to an existing skill's SKILL.md (target file, observed flaw, edit summary) on the ticket's `## Fix Strategy` section."
153
+ 3. `Other codification shape` — description: "Capture the fix shape as **free-text** on the ticket's `## Fix Strategy` section. Covers agent / hook / settings / script / CI / ADR / JTBD / guide / test fixture / memory / internal code change. Include the shape name, suggested stub details, and routing target where applicable (e.g. `/wr-architect:create-adr` for ADR; `/wr-jtbd:update-guide` for JTBD; `/wr-voice-tone:update-guide` for voice)."
154
+ 4. `Self-contained work no codification stub` — description: "The ticket is a bounded one-shot edit with no recurring-pattern signal. **Rule 6 audit note**: this option is valid only when the observation is a bounded one-shot edit with no recurring-pattern signal. It is NOT a silent-skip escape hatch — if any recurring-pattern signal is present, pick Option 1/2/3 instead so P044's recommend-skills intent is preserved."
155
+
156
+ **Recording**: append a `## Fix Strategy` section to the ticket (or edit the existing section if present). The section records the chosen Option, the shape, and the stub fields (for Options 1–2 the stub template; for Option 3 the free-text fix shape; for Option 4 the bounded-one-shot reason). The fix strategy lives on the ticket — not in the retro summary — so it travels with the problem through its lifecycle.
157
+
158
+ **Non-interactive / AFK branch (ADR-013 Rule 6 + ADR-032 deferred-question contract)**: When `AskUserQuestion` is unavailable, Stage 2 defers via the ADR-032 **deferred-question artefact** each ticket gets a pending-question entry asking for the proposed fix strategy; the main agent surfaces the questions on the next interactive session. Stage 2 does NOT fabricate a fix-strategy choice in AFK mode; the ticket lives without a `## Fix Strategy` section until the user answers. ADR-032's FIFO concurrency handling applies: N pending questions (one per Stage 1 ticket) queue in serial order.
159
+
160
+ #### Stub templates by Option
161
+
162
+ When Stage 2 selects Option 1 (`Skillcreate stub`), write the following into the ticket's `## Fix Strategy` section:
163
+
162
164
  - **Kind** — `create`
163
- - **Shape** — which codification type (skill, agent, hook, etc.)
164
- - **Suggested name** — for skills: `wr-<plugin>:<action>`; for agents: `<plugin>:<name>`; for hooks: `<event>:<trigger>`; for scripts: `scripts/<name>.<ext>`; etc.
165
- - **Scope** — one sentence on what the codification does and when it should fire
166
- - **Triggers** — example user prompts or events that should invoke it
167
- - **Prior uses** — 2-3 observed invocations from this session
165
+ - **Shape** — `skill`
166
+ - **Suggested name** — `wr-<plugin>:<verb>-<object>` per ADR-010 amended skill-granularity rule.
167
+ - **Scope** — one sentence on what the skill does and when it should fire.
168
+ - **Triggers** — 2-3 example user prompts or events.
169
+ - **Prior uses** — 2-3 observed invocations from this session.
170
+
171
+ When Stage 2 selects Option 2 (`Skill — improvement stub`), write:
168
172
 
169
- When the user chooses any of the **Improvement stub** shapes (skill / agent / hook), record a candidate entry in the Step 5 summary under "Codification Candidates" with:
170
173
  - **Kind** — `improve`
171
- - **Shape** — which existing codifiable is being edited (skill, agent, hook)
172
- - **Target file** — the existing file path (e.g. `packages/itil/skills/manage-problem/SKILL.md`)
173
- - **Observed flaw** — one-sentence description of the gap, friction, or defect
174
- - **Edit summary** — one-sentence description of the proposed targeted edit
175
- - **Evidence** — 1-3 observations from this session showing the flaw
174
+ - **Shape** — `skill`
175
+ - **Target file** — existing SKILL.md path (e.g. `packages/itil/skills/manage-problem/SKILL.md`).
176
+ - **Observed flaw** — one sentence.
177
+ - **Edit summary** — one sentence describing the targeted edit.
178
+ - **Evidence** — 1-3 session observations showing the flaw.
179
+
180
+ When Stage 2 selects Option 3 (`Other codification shape`), write free-text on the ticket's `## Fix Strategy` section including: the codification shape (agent / hook / settings / script / CI / ADR / JTBD / guide / test fixture / memory / internal code), a suggested stub (`Suggested name:` / `Target file:` / `Event + trigger:` as fits the shape), the routing target skill (`/wr-architect:create-adr`, `/wr-jtbd:update-guide`, `/wr-voice-tone:update-guide`, `/wr-style-guide:update-guide`, `/wr-risk-scorer:update-policy`), and 1-3 session observations. Free-text capture keeps the ADR-013 Rule 1 4-option cap intact without cascading follow-up batches (architect Q4 lean (b); P061 anti-pattern avoided).
181
+
182
+ When Stage 2 selects Option 4 (`Self-contained work — no codification stub`), write a single-line note: "Self-contained work — bounded one-shot edit, no recurring-pattern signal observed this session." The `## Fix Strategy` section records the Option 4 choice so future sessions know codification was considered and deferred with cause.
176
183
 
177
- When the user chooses any of the **Invoke <dedicated skill>** routes (ADR create / JTBD / Guide / Problem) OR the improvement routing options (ADR supersede or amend / Guide improvement edit / Problem edit existing ticket), delegate to the named skill with a context hand-off describing the candidate. Record the routing decision in the Step 5 summary under "Codification Candidates" with Kind (`create` or `improve`), Shape = the routing target, and a `routed to <skill>` marker. For `ADR — supersede or amend`, include the `supersedes ADR-N` hint in the hand-off so create-adr produces the correct MADR header.
184
+ #### Interaction with P044 / P050 / P051 / P068 / P074
178
185
 
179
- When the user chooses **Skip**, record the candidate in the Step 5 summary under "Codification Candidates" with a `skipped` marker so the pattern is still visible in the session audit trail.
186
+ - **P044** (recommend new skills) Stage 2 Option 1 (`Skill create stub`) carries P044's enforcement intent. P044's AFK recommend-skills semantics migrate to the deferred-question fallback (Stage 2 defers per ticket in AFK).
187
+ - **P050** (recommend other codifiables) — Stage 2 Option 3 (`Other codification shape`) carries P050's shape-generalisation into free-text capture.
188
+ - **P051** (improvement axis) — Stage 2 Option 2 (`Skill — improvement stub`) carries P051's improvement axis for skill shape; non-skill improvements ride in Option 3's free-text capture with an explicit `improve` marker.
189
+ - **P068** (verification-close housekeeping) — unaffected. Step 4a stays as-is, independent of Step 4b's restructure.
190
+ - **P074** (pipeline-instability scan) — when P074 ships its Step 2b, detected instability signals feed Stage 1 as additional ticket sources. The ticket-first flow is the natural common funnel P074's RCA identifies.
180
191
 
181
- **Non-interactive fallback (per ADR-013 Rule 6):** if `AskUserQuestion` is unavailable, record each candidate in the Step 5 summary under "Codification Candidates" with a `flagged — not actioned (non-interactive)` marker, noting the identified Kind alongside Shape (e.g. `Kind: improve, Shape: skill, flagged — not actioned (non-interactive)`). Do not create stubs, route to dedicated skills, or scaffold. The user can review the flags and decide when they return. Improvement candidates flagged this way retain the Target file and Observed flaw fields so the user has enough to act on without re-deriving the context.
192
+ #### Coordinating-ticket rule
182
193
 
183
- **Backward compatibility**: "Skill" is retained as one shape among many so existing P044 muscle memory and `run-retro-skill-candidates.bats` continue to hold. Use the singular shape name in the summary (e.g. `Shape: skill`) so legacy greps still match. Improvement-axis rows use the same singular shape names (`Shape: skill, Kind: improve`) so the Shape column stays consistent across both axes.
194
+ If a single target output accumulates 3 improvement observations in one session, Stage 1 should create **one coordinating ticket** scoped as "apply N improvements to <target>" rather than N separate tickets. Stage 2 on that ticket picks Option 2 (for skill-shape targets) or Option 3 with the coordinating-ticket free-text. This reduces ticket churn and keeps the affected output's improvement queue coherent.
184
195
 
185
196
  ### 5. Summary
186
197
 
@@ -56,18 +56,30 @@ setup() {
56
56
  [ "$status" -eq 0 ]
57
57
  }
58
58
 
59
- @test "SKILL.md Step 4b names the structured options for skill-shaped candidates (P044, updated by P050)" {
60
- # P044 required three specific option labels (Create a new skill / Track as a
61
- # problem / Skip — not skill-worthy). P050 generalises: the skill shape now
62
- # appears as a row in the flat shape-prefixed option list ("Skill — create
63
- # stub"). The "track as problem" path becomes the explicit "Problem" row.
64
- # Skip path becomes "Skip not codify-worthy". This test accepts either
65
- # pattern so the P044 regression guard survives P050's generalisation.
59
+ @test "SKILL.md Step 4b names the structured options for skill-shaped candidates (P044, updated by P050, reframed by P075)" {
60
+ # P044 required three specific option labels (Create a new skill / Track as
61
+ # a problem / Skip — not skill-worthy). P050 generalised to "Skill create
62
+ # stub". P075 reframes Step 4b as a two-stage flow: Stage 1 tickets
63
+ # mechanically (the former "Track as a problem" decision disappears
64
+ # because ticketing is no longer a user choice); Stage 2 asks the
65
+ # fix-strategy question with four options. P044's recommend-new-skills
66
+ # intent now rides in Stage 2 Option 1 (`Skill — create stub`). The
67
+ # former "Skip — not skill-worthy / not codify-worthy" path becomes
68
+ # Stage 2 Option 4 (`Self-contained work — no codification stub`) with a
69
+ # Rule 6 audit note preventing silent-skip. Accept all three forms across
70
+ # the P044 / P050 / P075 lineage.
66
71
  run grep -in "Create a new skill\|Skill — create stub\|Skill - create stub" "$SKILL_FILE"
67
72
  [ "$status" -eq 0 ]
68
- run grep -in "Track as a problem ticket\|Problem — invoke manage-problem\|Problem - invoke manage-problem" "$SKILL_FILE"
73
+ # Stage 1 ticketing delegation is mechanical post-P075; assert the
74
+ # delegation path is named in SKILL.md so the P044 "Track as a problem
75
+ # ticket" enforcement intent still has a visible home.
76
+ run grep -in "Track as a problem ticket\|Problem — invoke manage-problem\|/wr-itil:manage-problem\|/wr-itil:capture-problem" "$SKILL_FILE"
69
77
  [ "$status" -eq 0 ]
70
- run grep -in "Skip not skill-worthy\|Skip - not skill-worthy\|Skip — not codify-worthy\|Skip - not codify-worthy" "$SKILL_FILE"
78
+ # The skip-equivalent path. P075 renamed this to "Self-contained work" so
79
+ # P044's escape-hatch pain pattern does not leak; keep the legacy strings
80
+ # in the regex for backward compatibility across the P044 / P050 / P075
81
+ # lineage.
82
+ run grep -in "Skip — not skill-worthy\|Skip - not skill-worthy\|Skip — not codify-worthy\|Skip - not codify-worthy\|Self-contained work — no codification stub\|Self-contained work - no codification stub" "$SKILL_FILE"
71
83
  [ "$status" -eq 0 ]
72
84
  }
73
85
 
@@ -0,0 +1,155 @@
1
+ #!/usr/bin/env bats
2
+ # Doc-lint guard: run-retro SKILL.md Step 4b must implement the two-stage
3
+ # ticket-first flow introduced by P075. Every codify-worthy observation files
4
+ # a problem ticket first (mechanical, Stage 1); the codification shape is
5
+ # recorded as the proposed fix strategy on the ticket (user-interactive,
6
+ # Stage 2). The 19-option flat AskUserQuestion is removed.
7
+ #
8
+ # Structural assertion — Permitted Exception to the source-grep ban
9
+ # (ADR-005 / P011 / ADR-037 contract-assertion pattern). These tests assert
10
+ # that the SKILL.md contract document carries the new two-stage structure
11
+ # and the architect-required audit notes.
12
+ #
13
+ # @problem P075
14
+ # @jtbd JTBD-001 (enforce governance without slowing down — removes redundant ticketing axis)
15
+ # @jtbd JTBD-006 (progress the backlog while I'm away — AFK-safe Rule 6 fallback)
16
+ # @jtbd JTBD-101 (extend the suite with clear patterns)
17
+ # @jtbd JTBD-201 (audit trail of AI-assisted work — every observation lands in the backlog)
18
+ #
19
+ # Cross-reference:
20
+ # P075: docs/problems/075-run-retro-codification-prompt-redundant-every-observation-is-a-problem.open.md
21
+ # P016: concern-boundary split preserved in Stage 1
22
+ # P044 / P050 / P051: reframed by P075 — fix-strategy on a problem ticket, not one option among 19
23
+ # ADR-013 Rule 1 / Rule 6 (docs/decisions/013-structured-user-interaction-for-governance-decisions.proposed.md)
24
+ # ADR-032 (docs/decisions/032-governance-skill-invocation-patterns.proposed.md) —
25
+ # foreground-spawns-N-background-fanout pattern and deferred-question contract
26
+ # ADR-037 (docs/decisions/037-skill-testing-strategy.proposed.md) — contract-assertion bats pattern
27
+
28
+ setup() {
29
+ SKILL_DIR="$(cd "$(dirname "$BATS_TEST_FILENAME")/.." && pwd)"
30
+ SKILL_FILE="${SKILL_DIR}/SKILL.md"
31
+ }
32
+
33
+ @test "SKILL.md Step 4b has Stage 1 and Stage 2 headings (P075)" {
34
+ # Two-stage flow is the load-bearing shape of the P075 fix.
35
+ run grep -inE "Stage 1[: —-]|### Stage 1|\*\*Stage 1\*\*" "$SKILL_FILE"
36
+ [ "$status" -eq 0 ]
37
+ run grep -inE "Stage 2[: —-]|### Stage 2|\*\*Stage 2\*\*" "$SKILL_FILE"
38
+ [ "$status" -eq 0 ]
39
+ }
40
+
41
+ @test "SKILL.md Step 4b Stage 1 delegates to manage-problem (P075)" {
42
+ # Stage 1 fires mechanically: every codify-worthy observation becomes a
43
+ # problem ticket via the ticket-creation delegation path. Accept either the
44
+ # foreground manage-problem path or the future capture-problem background
45
+ # path (ADR-032 sibling).
46
+ run grep -inE "/wr-itil:manage-problem|/wr-itil:capture-problem" "$SKILL_FILE"
47
+ [ "$status" -eq 0 ]
48
+ }
49
+
50
+ @test "SKILL.md Step 4b Stage 1 is mechanical — no ticketing AskUserQuestion (P075)" {
51
+ # The P075 user directive: "The answer should always be 'create a problem
52
+ # ticket' so the question is redundant." Stage 1 must NOT present
53
+ # ticketing as an AskUserQuestion option alongside other shapes. The
54
+ # literal "Problem — invoke manage-problem" option from the legacy
55
+ # 19-option flat list must be gone.
56
+ run grep -in "Problem — invoke manage-problem\|Problem - invoke manage-problem" "$SKILL_FILE"
57
+ [ "$status" -ne 0 ]
58
+ }
59
+
60
+ @test "SKILL.md Step 4b Stage 2 uses AskUserQuestion with 'Proposed fix' header (P075)" {
61
+ # Stage 2 is the per-ticket fix-strategy prompt. Architect-pinned header is
62
+ # "Proposed fix" to distinguish from the legacy "Codification candidate" header.
63
+ run grep -inE "header:[[:space:]]*['\"]Proposed fix['\"]|\"Proposed fix\"" "$SKILL_FILE"
64
+ [ "$status" -eq 0 ]
65
+ }
66
+
67
+ @test "SKILL.md Step 4b Stage 2 has four top-level architect-pinned options (ADR-013 Rule 1 cap)" {
68
+ # ADR-013 Rule 1: AskUserQuestion options capped at 4. Architect review
69
+ # flagged cascading follow-ups as an anti-pattern (P061 precedent); the
70
+ # architect lean is free-text capture on the Fix Strategy section rather
71
+ # than N-deep fan-out. Stage 2 must list the four architect-pinned option
72
+ # labels as numbered Markdown list items.
73
+ run grep -cE "^[[:space:]]*[0-9]+\.[[:space:]]+\`?(Skill — create stub|Skill — improvement stub|Other codification shape|Self-contained work)" "$SKILL_FILE"
74
+ [ "$status" -eq 0 ]
75
+ [ "$output" -ge 4 ]
76
+ }
77
+
78
+ @test "SKILL.md Step 4b Stage 2 option 4 is 'Self-contained work — no codification stub' (P075 + architect Q2)" {
79
+ # Architect Q2: the "no codification" option must not re-create the P044
80
+ # escape hatch. Required rename to "Self-contained work — no codification
81
+ # stub" with a Rule 6 audit note clarifying the valid scope.
82
+ run grep -inE "Self-contained work[[:space:]]*—[[:space:]]*no codification stub" "$SKILL_FILE"
83
+ [ "$status" -eq 0 ]
84
+ }
85
+
86
+ @test "SKILL.md Step 4b Stage 2 option 4 carries a Rule 6 audit note (P075 + architect Q2)" {
87
+ # Architect-required audit note: Option 4 valid only when the problem is a
88
+ # bounded one-shot edit with no recurring-pattern signal. Protects P044's
89
+ # recommend-skills intent from leaking.
90
+ run grep -inE "bounded one-shot|no recurring[- ]pattern signal|not a recurring pattern|one-shot edit" "$SKILL_FILE"
91
+ [ "$status" -eq 0 ]
92
+ }
93
+
94
+ @test "SKILL.md Step 4b Stage 2 option 3 captures 'Other codification shape' via free-text (P075 + architect Q4)" {
95
+ # Architect Q4 lean (b): free-text Fix Strategy capture, not cascading
96
+ # AskUserQuestion batches. The option must name the free-text capture
97
+ # path so downstream readers can tell cascading was rejected.
98
+ run grep -inE "Other codification shape.*free[- ]text|free[- ]text.*Fix Strategy|free[- ]text capture" "$SKILL_FILE"
99
+ [ "$status" -eq 0 ]
100
+ }
101
+
102
+ @test "SKILL.md Step 4b records proposed fix on the ticket's '## Fix Strategy' section (P075)" {
103
+ # Fix strategy lives on the ticket, not in the retro summary. This is the
104
+ # structural invariant that decouples ticketing (Stage 1) from codification
105
+ # (Stage 2).
106
+ run grep -inE "## Fix Strategy|Fix Strategy section" "$SKILL_FILE"
107
+ [ "$status" -eq 0 ]
108
+ }
109
+
110
+ @test "SKILL.md Step 4b cites ADR-032 for Stage 1 foreground-spawns-N-background fanout (architect Q1)" {
111
+ # Architect Q1: Stage 1 is a legitimate foreground-spawns-N-background case
112
+ # that ADR-032 does not currently contemplate. The SKILL.md must cite
113
+ # ADR-032 explicitly so the cross-plugin ownership boundary is legible.
114
+ run grep -inE "ADR-032|032-governance-skill-invocation-patterns" "$SKILL_FILE"
115
+ [ "$status" -eq 0 ]
116
+ }
117
+
118
+ @test "SKILL.md Step 4b Stage 2 AFK fallback defers via deferred-question contract (P075 + architect Q3)" {
119
+ # Architect Q3: ADR-032 lines 90-110 define the deferred-question artefact
120
+ # shape (pending-questions + resumption context, FIFO). Stage 2 AFK branch
121
+ # must cite that contract, not re-invent a new artefact.
122
+ run grep -inE "deferred[- ]question|deferred question|pending[- ]question" "$SKILL_FILE"
123
+ [ "$status" -eq 0 ]
124
+ }
125
+
126
+ @test "SKILL.md Step 4b Stage 1 preserves P016 concern-boundary split (P075)" {
127
+ # Stage 1's mechanical ticketing must still apply the concern-boundary rule
128
+ # so multi-concern observations split into separate tickets, not land as one
129
+ # conflated ticket.
130
+ run grep -inE "P016|concern[- ]boundary" "$SKILL_FILE"
131
+ [ "$status" -eq 0 ]
132
+ }
133
+
134
+ @test "SKILL.md Step 4b retains ADR-013 Rule 6 non-interactive fallback (P075)" {
135
+ # Rule 6 must still apply: when AskUserQuestion is unavailable, Stage 2
136
+ # defers; Stage 1 fires regardless because it is mechanical.
137
+ run grep -inE "Rule 6|non-interactive" "$SKILL_FILE"
138
+ [ "$status" -eq 0 ]
139
+ }
140
+
141
+ @test "SKILL.md Step 4b drops the legacy 19-option flat list (P075 regression guard)" {
142
+ # The legacy flat list had option rows like "Settings — propose entry",
143
+ # "CI — propose step", "Test fixture — create stub", "Memory — propose
144
+ # note". Those rows are replaced by Stage 2 Option 3's free-text capture.
145
+ # This test is the regression guard — the legacy rows must not survive
146
+ # the rewrite.
147
+ run grep -inE "^[[:space:]]+[0-9]+\.[[:space:]]+\`?Settings[[:space:]]*—[[:space:]]*propose entry" "$SKILL_FILE"
148
+ [ "$status" -ne 0 ]
149
+ run grep -inE "^[[:space:]]+[0-9]+\.[[:space:]]+\`?CI[[:space:]]*—[[:space:]]*propose step" "$SKILL_FILE"
150
+ [ "$status" -ne 0 ]
151
+ run grep -inE "^[[:space:]]+[0-9]+\.[[:space:]]+\`?Test fixture[[:space:]]*—[[:space:]]*create stub" "$SKILL_FILE"
152
+ [ "$status" -ne 0 ]
153
+ run grep -inE "^[[:space:]]+[0-9]+\.[[:space:]]+\`?Memory[[:space:]]*—[[:space:]]*propose note" "$SKILL_FILE"
154
+ [ "$status" -ne 0 ]
155
+ }