npm - @windyroad/retrospective - Versions diffs - 0.1.6 → 0.2.0-preview.109 - Mend

@windyroad/retrospective 0.1.6 → 0.2.0-preview.109

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (5) hide show

package/.claude-plugin/plugin.json +1 -1
package/package.json +1 -1
package/skills/run-retro/SKILL.md +58 -24
package/skills/run-retro/test/run-retro-codification-candidates.bats +102 -0
package/skills/run-retro/test/run-retro-skill-candidates.bats +27 -17

package/.claude-plugin/plugin.json CHANGED Viewed

@@ -1,5 +1,5 @@
 {
   "name": "wr-retrospective",
-  "version": "0.1.6",
+  "version": "0.2.0",
   "description": "Session retrospective reminders and plan review for Claude Code"
 }

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "@windyroad/retrospective",
-  "version": "0.1.6",
+  "version": "0.2.0-preview.109",
   "description": "Session retrospectives that update briefings and create problem tickets",
   "bin": {
     "windyroad-retrospective": "./bin/install.mjs"

package/skills/run-retro/SKILL.md CHANGED Viewed

@@ -28,12 +28,28 @@ Consider the work done in this session and identify:
 **What should we make easier or automate** — repetitive manual steps, missing tooling, things that could be scripted. These should become problem tickets via the `/problem` skill.
-**What recurring workflow did I (or the assistant) perform that would be better as a skill?** — multi-step sequences that (a) were invoked multiple times in one session, (b) have a deterministic action order, and (c) are reusable across projects, not project-specific logic. These are **skill candidates**, not problem tickets, and route through Step 4b below.
+**What recurring pattern did I (or the assistant) observe that would be better codified?** — a pattern that (a) was invoked multiple times in one session or across sessions, (b) has a deterministic action order or a clear invariant, and (c) is reusable beyond one project. These are **codification candidates** and route through Step 4b below. Do not treat them as problem tickets unless the user explicitly picks that routing option.
-Examples of skill candidates vs problem tickets:
-- `fetch origin → check changesets → score risk → commit → push → release → sync manifest → mark Fix Released` — deterministic, recurring, cross-project → **skill candidate** (e.g. `wr-itil:ship-fix`).
-- "The commit gate rejected my work twice because X was misconfigured" — diagnostic, project-specific → **problem ticket**.
-- "I always forget to run `npm run verify` before pushing" — short, user-habit rather than codifiable sequence → **BRIEFING.md** note, not a skill.
+For each codification candidate, also identify the **best shape** for the codification. The Windy Road suite supports many shapes — pick the one that fits the pattern, not the one you happened to learn first:
+- **Skill** — deterministic multi-step sequence the user invokes by name (e.g. `wr-itil:ship-fix`). Worked example: `fetch origin → check changesets → score risk → commit → push → release → sync manifest → mark Fix Released`.
+- **Agent** — bounded investigation or review the main agent should delegate to (e.g. a performance-specialist the architect calls in for runtime-path changes). Place under `packages/<plugin>/agents/`.
+- **Hook** — event-driven enforcement or prompt injection (PreToolUse, PostToolUse, UserPromptSubmit). Use when "I keep forgetting to X before Y" — hooks make X unmissable without adding memory load.
+- **Settings entry** — `.claude/settings.json` changes: allowlisted commands, env vars, hook wiring. Best fit when a session repeatedly hits permission prompts for the same benign tool.
+- **Shell or Node script** — reusable repo-level tooling in `scripts/` (e.g. `sync-install-utils.sh`, `sync-plugin-manifests.mjs`). Best fit for multi-step shell sequences worth scripting.
+- **CI step** — `.github/workflows/*.yml` insertion. Best fit for "we'd have caught that earlier with a CI check".
+- **ADR** — architectural decision worth recording. Route to `/wr-architect:create-adr`.
+- **JTBD** — job-to-be-done record for a persona. Route to `/wr-jtbd:update-guide`.
+- **Guide** — voice, style, or risk policy edit. Route to `/wr-voice-tone:update-guide`, `/wr-style-guide:update-guide`, or `/wr-risk-scorer:update-policy`.
+- **Problem ticket** — diagnostic, project-specific friction (the default for flaws). Route to `/wr-itil:manage-problem`.
+- **Test fixture** — regression test for a recurring failure pattern (bats fixture, unit test). Best fit when the observation is "this kept breaking the same way".
+- **Memory** — per-user or per-project memory note in `~/.claude/.../memory/`. Best fit for short, user-habit observations that aren't a codifiable sequence (e.g. "I always forget to run `npm run verify` before pushing").
+If no shape fits — the observation is a one-off learning, not a repeating pattern — it belongs in BRIEFING.md (Step 3), not Step 4b.
+Counter-examples (what does **not** become a codification candidate):
+- "The commit gate rejected my work twice because X was misconfigured" — diagnostic, project-specific → **problem ticket** shape (route via Step 4b).
+- "I always forget to run `npm run verify` before pushing" — short, user-habit rather than codifiable sequence → **memory** shape or **BRIEFING.md** note.
 ### 3. Update BRIEFING.md
@@ -57,31 +73,44 @@ For each item identified in "What was harder than it should have been", "What fa
 - If yes: update it with new evidence from this session
 - If no: create a new problem ticket
-### 4b. Recommend new skills
+### 4b. Recommend new codifications
-For each **skill candidate** identified in Step 2, route the decision through `AskUserQuestion`. This is the ADR-013 Rule 1 structured-interaction pattern — do not present the choices as prose enumeration in the skill output.
+For each **codification candidate** identified in Step 2, route the decision through a single `AskUserQuestion` call. This is the ADR-013 Rule 1 structured-interaction pattern — do not present the choices as prose enumeration in the skill output. The shape identified in Step 2 determines which option rows the user picks from; every shape routes through the same `AskUserQuestion` so the decision stays one structured interaction (architect decision: flat shape-prefixed options, not a two-step type-then-action flow).
 For each candidate, invoke `AskUserQuestion` with:
-- `header: "Skill candidate"`
+- `header: "Codification candidate"`
 - `multiSelect: false`
-- Three options:
-  1. `Create a new skill` — description: "Scaffold a new skill for this recurring workflow. Requires suggested name, scope, and triggers — Step 4b records them for a future scaffolding flow to pick up."
-  2. `Track as a problem ticket instead` — description: "File the candidate via /wr-itil:manage-problem so it can be planned and WSJF-ranked alongside other backlog items."
-  3. `Skip — not skill-worthy` — description: "Neither create nor track. The candidate is too small, too ambiguous, or too project-specific to codify."
-When the user chooses **Create a new skill**, record a candidate entry with:
-- **Suggested skill name** (e.g. `wr-itil:ship-fix`) — kebab-case, namespaced to the owning plugin
-- **Scope** — one sentence on what the skill does and when it should fire
-- **Triggers** — example user prompts that should invoke it
+- Options (a flat list; each option names the shape up front so the decision is auditable):
+  1. `Skill — create stub` — description: "Record a stub candidate (suggested name, scope, triggers, prior uses) for a future scaffolding flow. Skill scaffolding itself is out of scope for this retrospective."
+  2. `Agent — create stub` — description: "Record a stub candidate for a new agent (suggested name, scope, trigger conditions, delegating skill). Place under `packages/<plugin>/agents/` when scaffolded."
+  3. `Hook — create stub` — description: "Record a stub candidate for a new hook (event: PreToolUse / PostToolUse / UserPromptSubmit; trigger; action summary)."
+  4. `Settings — propose entry` — description: "Record a proposed `.claude/settings.json` entry (allowlist / env / hook wiring) for later review."
+  5. `Script — create stub` — description: "Record a stub `scripts/*.sh` or `scripts/*.mjs` candidate (shebang + TODO + scope)."
+  6. `CI — propose step` — description: "Record a proposed `.github/workflows/ci.yml` insertion."
+  7. `ADR — invoke create-adr` — description: "Delegate to `/wr-architect:create-adr` so the decision is captured with proper MADR structure. Routing skill, not a stub."
+  8. `JTBD — invoke update-guide` — description: "Delegate to `/wr-jtbd:update-guide` to add or amend a job-to-be-done record. Routing skill, not a stub."
+  9. `Guide — invoke update-guide / update-policy` — description: "Delegate to `/wr-voice-tone:update-guide`, `/wr-style-guide:update-guide`, or `/wr-risk-scorer:update-policy` depending on the guide touched."
+  10. `Problem — invoke manage-problem` — description: "Delegate to `/wr-itil:manage-problem` so the candidate is WSJF-ranked against other backlog items. Routing skill, not a stub."
+  11. `Test fixture — create stub` — description: "Record a candidate bats / unit-test fixture for the recurring failure pattern."
+  12. `Memory — propose note` — description: "Record a proposed memory note (per-user or per-project) for a short user-habit observation that isn't a codifiable sequence."
+  13. `Skip — not codify-worthy` — description: "Neither stub nor route. The observation is too small, too ambiguous, or a one-off learning that belongs in BRIEFING.md."
+If the option count is impractical for a single `AskUserQuestion` payload in a given Claude Code version, fall back to a two-question flow: (1) `"Which shape fits?"` with the shape list, (2) `"What action?"` with `Create stub / Invoke dedicated skill / Skip` — but prefer the single call when the surface allows it.
+When the user chooses any of the **Create stub** shapes (skill / agent / hook / settings / script / CI / test / memory), record a candidate entry in the Step 5 summary under "Codification Candidates" with:
+- **Shape** — which codification type (skill, agent, hook, etc.)
+- **Suggested name** — for skills: `wr-<plugin>:<action>`; for agents: `<plugin>:<name>`; for hooks: `<event>:<trigger>`; for scripts: `scripts/<name>.<ext>`; etc.
+- **Scope** — one sentence on what the codification does and when it should fire
+- **Triggers** — example user prompts or events that should invoke it
 - **Prior uses** — 2-3 observed invocations from this session
-Store the candidate entry in the Step 5 summary under "Skill Candidates". Skill scaffolding itself is out of scope for this retrospective — scaffolding may land in a future skill (see P012 skill testing harness for the testing side of that).
+When the user chooses any of the **Invoke <dedicated skill>** routes (ADR / JTBD / Guide / Problem), delegate to the named skill with a context hand-off describing the candidate. Record the routing decision in the Step 5 summary under "Codification Candidates" with Shape = the routing target and a `routed to <skill>` marker.
-When the user chooses **Track as a problem ticket**, fall through to the Step 4 flow with a "this should be a skill" note in the ticket description.
+When the user chooses **Skip**, record the candidate in the Step 5 summary under "Codification Candidates" with a `skipped` marker so the pattern is still visible in the session audit trail.
-When the user chooses **Skip**, record the candidate in the Step 5 summary under "Skill Candidates" with a `skipped` marker so the pattern is still visible in the session audit trail.
+**Non-interactive fallback (per ADR-013 Rule 6):** if `AskUserQuestion` is unavailable, record each candidate in the Step 5 summary under "Codification Candidates" with a `flagged — not actioned (non-interactive)` marker, noting the identified Shape. Do not create stubs, route to dedicated skills, or scaffold. The user can review the flags and decide when they return.
-**Non-interactive fallback (per ADR-013 Rule 6):** if `AskUserQuestion` is unavailable, record each candidate in the Step 5 summary under "Skill Candidates" with a `flagged — not actioned (non-interactive)` marker and do not create tickets or scaffold. The user can review the flags and decide when they return.
+**Backward compatibility**: "Skill" is retained as one shape among many so existing P044 muscle memory and `run-retro-skill-candidates.bats` continue to hold. Use the singular shape name in the summary (e.g. `Shape: skill`) so legacy greps still match.
 ### 5. Summary
@@ -98,13 +127,18 @@ Present a summary to the user:
 ### Problems Created/Updated
 - [problem ticket]: [summary]
-### Skill Candidates
-- [suggested name] — [scope]. Triggers: [examples]. Decision: [created / tracked as P<NNN> / skipped / flagged (non-interactive)].
+### Codification Candidates
+| Shape | Suggested name | Scope | Triggers | Decision |
+|-------|---------------|-------|----------|----------|
+| skill | [suggested name] | [scope] | [examples] | created stub / routed to <skill> / skipped / flagged (non-interactive) |
+| agent | ... | ... | ... | ... |
+| hook  | ... | ... | ... | ... |
 ### No Action Needed
 - [learnings that were already captured]
 ```
-If the "Skill Candidates" section is empty, omit it rather than rendering an empty header.
+If the "Codification Candidates" table has no rows, omit it rather than rendering an empty header. The legacy "Skill Candidates" heading is preserved as a worked-example row in the Shape column so downstream tooling that grepped for "Skill Candidates" continues to find skill-shaped entries within the unified table.
 $ARGUMENTS

package/skills/run-retro/test/run-retro-codification-candidates.bats ADDED Viewed

@@ -0,0 +1,102 @@
+#!/usr/bin/env bats
+# Doc-lint guard: run-retro SKILL.md must include a generalised codification
+# branch that recommends agents, hooks, and other codifiable outputs — not
+# only skills. This is the P050 generalisation of P044's single-output-type
+# recommendation surface.
+#
+# Structural assertion — Permitted Exception to the source-grep ban (ADR-005 / P011).
+# These tests assert that the skill specification document includes the
+# multi-shape codification branch introduced by P050.
+#
+# Cross-reference:
+#   P050: docs/problems/050-run-retro-does-not-recommend-other-codifiable-outputs.open.md
+#   P044: docs/problems/044-run-retro-does-not-recommend-new-skills.known-error.md (predecessor)
+#   ADR-013 Rule 1 / Rule 6 (docs/decisions/013-structured-user-interaction-for-governance-decisions.proposed.md)
+#   @jtbd JTBD-101 (extend the suite with clear patterns)
+#   @jtbd JTBD-006 (progress the backlog while I'm away — AFK-safe Rule 6 fallback)
+#   @jtbd JTBD-001 (enforce governance without slowing down)
+setup() {
+  SKILL_DIR="$(cd "$(dirname "$BATS_TEST_FILENAME")/.." && pwd)"
+  SKILL_FILE="${SKILL_DIR}/SKILL.md"
+}
+@test "SKILL.md Step 2 includes a generalised codification reflection category (P050)" {
+  # P050 fix: Step 2 must prompt for recurring patterns that would be better
+  # codified — not only as skills. The generalised wording names "codif"
+  # (codification / codifiable / codified) so reviewers and agents can tell
+  # P050 shipped.
+  run grep -in "codification candidate\|codifiable\|codify\|codified" "$SKILL_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "SKILL.md Step 2 names at least three codification shapes beyond skills (P050)" {
+  # The shape question is the point of P050. Names at minimum: agent, hook,
+  # and one of: settings, script, CI step, ADR, JTBD, guide, test. "skill"
+  # stays in the list as a worked example so P044 muscle memory survives.
+  run grep -ic "agent" "$SKILL_FILE"
+  [ "$status" -eq 0 ]
+  [ "$output" -ge 1 ]
+  run grep -ic "hook" "$SKILL_FILE"
+  [ "$status" -eq 0 ]
+  [ "$output" -ge 1 ]
+  run grep -icE "settings|script|ci step|ADR|JTBD|guide|test fixture" "$SKILL_FILE"
+  [ "$status" -eq 0 ]
+  [ "$output" -ge 1 ]
+}
+@test "SKILL.md Step 4b recommendation branch covers multiple shapes (P050)" {
+  # Step 4b must route per-shape. The recommendation branch names more than
+  # just "skill" as an output type.
+  run grep -inE "(agent|hook).*stub|stub.*(agent|hook)|create.*(agent|hook)|(agent|hook).*candidate" "$SKILL_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "SKILL.md Step 4b uses a single AskUserQuestion with shape-prefixed options (ADR-013 Rule 1)" {
+  # Architect decision: flat AskUserQuestion with type-prefixed labels, not a
+  # two-step chained flow. At least three shape-prefixed option lines must
+  # appear in the Step 4b block. Match both plain and backticked forms:
+  #   "Skill — ..." / "`Skill — ...`" / "**Skill** — ..."
+  run grep -cE "(\`|\*\*)?(Skill|Agent|Hook|Settings|Script|CI|ADR|JTBD|Guide|Problem|Test fixture|Memory)(\`|\*\*)? +(—|-) " "$SKILL_FILE"
+  [ "$status" -eq 0 ]
+  [ "$output" -ge 3 ]
+}
+@test "SKILL.md Step 4b routes ADR candidates to wr-architect:create-adr (P050)" {
+  # Dedicated codification skills already exist — Step 4b must route to them,
+  # not duplicate intake.
+  run grep -in "wr-architect:create-adr\|/wr-architect:create-adr" "$SKILL_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "SKILL.md Step 4b routes JTBD candidates to wr-jtbd:update-guide (P050)" {
+  run grep -in "wr-jtbd:update-guide\|/wr-jtbd:update-guide" "$SKILL_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "SKILL.md Step 4b preserves non-interactive fallback for generalised shapes (ADR-013 Rule 6)" {
+  # The Rule 6 fallback that P044 introduced must cover the generalised surface.
+  # When AskUserQuestion is unavailable, all shape candidates are flagged rather
+  # than silently chosen.
+  run grep -in "non-interactive\|Rule 6" "$SKILL_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "SKILL.md Step 5 summary has a Codification Candidates section with Shape column (P050)" {
+  # P050 candidate fix (3): unified table. Either:
+  #   - a "### Codification Candidates" heading AND a "Shape" column, OR
+  #   - a "### Codification Candidates" heading with per-shape rows.
+  # Accept either shape but require the heading.
+  run grep -n "### Codification Candidates\|## Codification Candidates" "$SKILL_FILE"
+  [ "$status" -eq 0 ]
+  run grep -in "shape" "$SKILL_FILE"
+  [ "$status" -eq 0 ]
+}
+@test "SKILL.md retains 'skill' as a worked example within the generalised category (backward compat with P044)" {
+  # Architect advisory: keep 'skill' in the shape list as one worked example so
+  # the existing P044 muscle memory and the run-retro-skill-candidates.bats
+  # line-30 grep still pass. This is the compatibility test.
+  run grep -in "would be better as a skill\|better as a skill\|as a skill\|skill candidate" "$SKILL_FILE"
+  [ "$status" -eq 0 ]
+}

package/skills/run-retro/test/run-retro-skill-candidates.bats CHANGED Viewed

@@ -23,11 +23,13 @@ setup() {
   [ "$output" = "---" ]
 }
-@test "SKILL.md Step 2 includes the skill-candidate reflection category (P044)" {
-  # P044 fix: Step 2 must prompt for recurring workflows that would be better as skills.
-  # Regression guard: if Step 2 is rewritten and the skill-candidate prompt is dropped,
-  # this test fails.
-  run grep -n "recurring workflow.*better as a skill\|would be better as a skill" "$SKILL_FILE"
+@test "SKILL.md Step 2 includes the skill-candidate reflection category (P044, updated by P050)" {
+  # P044 fix: Step 2 must prompt for recurring workflows that would be better
+  # as skills. P050 generalises this to a codification category, with "skill"
+  # retained as one worked example within the shape list. This test accepts
+  # either the original P044 phrasing OR the P050 generalised phrasing that
+  # still names "skill" as a shape.
+  run grep -in "recurring workflow.*better as a skill\|would be better as a skill\|recurring pattern.*better codified\|\*\*Skill\*\* — " "$SKILL_FILE"
   [ "$status" -eq 0 ]
 }
@@ -46,20 +48,26 @@ setup() {
   [ "$status" -eq 0 ]
 }
-@test "SKILL.md Step 4b header matches ADR-013 structured-interaction pattern" {
-  # Header: "Skill candidate" identifies the AskUserQuestion call site and is what
-  # other tests / review tooling can grep for.
-  run grep -n "Skill candidate" "$SKILL_FILE"
+@test "SKILL.md Step 4b header matches ADR-013 structured-interaction pattern (P044, updated by P050)" {
+  # P044 used "Skill candidate" as the AskUserQuestion header. P050 generalises
+  # to "Codification candidate" with "Skill" as one shape option. Accept either —
+  # both preserve the ADR-013 Rule 1 structured-interaction shape.
+  run grep -in "Skill candidate\|Codification candidate" "$SKILL_FILE"
   [ "$status" -eq 0 ]
 }
-@test "SKILL.md Step 4b names the three structured options (P044)" {
-  # The three decision options must be explicit: create, track, skip.
-  run grep -n "Create a new skill" "$SKILL_FILE"
+@test "SKILL.md Step 4b names the structured options for skill-shaped candidates (P044, updated by P050)" {
+  # P044 required three specific option labels (Create a new skill / Track as a
+  # problem / Skip — not skill-worthy). P050 generalises: the skill shape now
+  # appears as a row in the flat shape-prefixed option list ("Skill — create
+  # stub"). The "track as problem" path becomes the explicit "Problem" row.
+  # Skip path becomes "Skip — not codify-worthy". This test accepts either
+  # pattern so the P044 regression guard survives P050's generalisation.
+  run grep -in "Create a new skill\|Skill — create stub\|Skill - create stub" "$SKILL_FILE"
   [ "$status" -eq 0 ]
-  run grep -n "Track as a problem ticket" "$SKILL_FILE"
+  run grep -in "Track as a problem ticket\|Problem — invoke manage-problem\|Problem - invoke manage-problem" "$SKILL_FILE"
   [ "$status" -eq 0 ]
-  run grep -n "Skip — not skill-worthy\|Skip - not skill-worthy" "$SKILL_FILE"
+  run grep -in "Skip — not skill-worthy\|Skip - not skill-worthy\|Skip — not codify-worthy\|Skip - not codify-worthy" "$SKILL_FILE"
   [ "$status" -eq 0 ]
 }
@@ -70,11 +78,13 @@ setup() {
   [ "$status" -eq 0 ]
 }
-@test "SKILL.md Step 5 summary template has a Skill Candidates slot (P044)" {
+@test "SKILL.md Step 5 summary template has a Skill / Codification Candidates slot (P044, updated by P050)" {
   # P044 fix: the summary template must include a Skill Candidates section so
   # recommendations are visible in the session audit alongside BRIEFING changes
-  # and problem tickets.
-  run grep -n "### Skill Candidates" "$SKILL_FILE"
+  # and problem tickets. P050 generalises this to a unified "Codification
+  # Candidates" table with a Shape column; skill-shaped candidates still appear
+  # (as Shape: skill rows). Accept either heading.
+  run grep -n "### Skill Candidates\|### Codification Candidates" "$SKILL_FILE"
   [ "$status" -eq 0 ]
 }