npm - @windyroad/retrospective - Versions diffs - 0.8.0 → 0.9.0-preview.210 - Mend

@windyroad/retrospective 0.8.0 → 0.9.0-preview.210

This diff shows the contents of publicly released package versions as they appear in their public registries. It is provided for informational purposes only.

Files changed (5)

package/.claude-plugin/plugin.json +1 -1
package/package.json +1 -1
package/skills/run-retro/SKILL.md +122 -2
package/skills/run-retro/test/run-retro-anti-pattern-clause.bats +87 -0
package/skills/run-retro/test/run-retro-signal-vs-noise.bats +129 -0

package/.claude-plugin/plugin.json CHANGED

@@ -1,5 +1,5 @@
 {
   "name": "wr-retrospective",
-  "version": "0.8.0",
+  "version": "0.9.0",
   "description": "Session retrospective reminders and plan review for Claude Code"
 }

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@windyroad/retrospective",
-  "version": "0.8.0",
+  "version": "0.9.0-preview.210",
   "description": "Session retrospectives that update briefings and create problem tickets",
   "bin": {
     "windyroad-retrospective": "./bin/install.mjs"

package/skills/run-retro/SKILL.md CHANGED

@@ -8,12 +8,77 @@ allowed-tools: Read, Write, Edit, Bash, Glob, Grep, AskUserQuestion, Skill
 Reflect on the current session, update the project briefing, and create problem tickets for failures and friction.
+## When to use
+### Supported invocation surfaces
+- **Foreground `/wr-retrospective:run-retro`** — the canonical invocation. The user types the slash command in their parent session; the retro runs with full visibility of the session's tool-call history. This is the only invocation surface every other use case falls back to.
+- **`claude -p` subprocess invocation** — supported per **P086** (the AFK `/wr-itil:work-problems` iteration subprocess invokes run-retro before emitting `ITERATION_SUMMARY`). The subprocess has the iteration's tool-call history naturally; retro runs with iteration-bounded scope and produces correct findings for that scope. ADR-032 subprocess-boundary variant covers this surface.
+### Anti-pattern: Never invoke as a background agent
+Do **NOT** invoke run-retro via `Agent(run_in_background: true)` or any background-subagent surface (the deferred ADR-032 `capture-retro` sibling). Background subagents have isolated context at spawn — they cannot see the parent session's tool-call history, which is run-retro's primary input. A background retro would either produce empty findings, require explicit context-marshalling at spawn (the "shenanigans" the user direction rejected), or post-hoc parse session logs (out of scope today).
+The `/wr-retrospective:capture-retro` background sibling listed in early ADR-032 drafts is **deferred pending resolution of the context-marshalling problem** (P088, 2026-04-21 user direction: *"run-retro cannot be done as a subagent, because it won't have the context"*). The other ADR-032 background siblings (`capture-problem`, `capture-adr`) are unaffected — their inputs are self-contained aside payloads, not whole-session histories. See **ADR-032** in-scope-list amendment and **P088** ticket for the full settlement.
+This anti-pattern clause does NOT forbid retro inside an AFK iteration subprocess (P086) — that surface is the `claude -p` row above, not the background-agent row. Those two surfaces are distinct: `claude -p` is a fresh main Claude Code session that loads its own context naturally; `Agent(run_in_background: true)` is a subagent spawned inside an existing session whose context is isolated from the parent.
 ## Steps
 ### 1. Read the current briefing
 Read `docs/briefing/README.md` — the per-topic index, Critical Points summary, and per-file hooks. Then read each topic file referenced in the Topic Index (`docs/briefing/<topic>.md`) to understand what previous sessions captured under each heading. During the P100 transition window the legacy single-file `docs/BRIEFING.md` may still exist as a stub pointer; it is read-only until P100 slice 2 retires it.
+### 1.5. Briefing signal-vs-noise pass (P105)
+After reading the briefing tree, score every entry in `docs/briefing/*.md` to decide whether it was **signal** (useful this session), **noise** (loaded but not useful), or **decay-only** (not in context at all). This pass drives the Critical Points roll-up curation that the SessionStart hook consumes.
+**Scoring rules** (applied per entry, per retro cycle):
+| Event | Delta | Trigger |
+|-------|-------|---------|
+| Signal | +2 | Entry was cited, paraphrased, or acted on during this session. |
+| Noise | -1 | Entry was loaded into context but not cited or acted on. |
+| Decay | -1 | Applied to **all** entries every retro cycle, regardless of signal/noise status. |
+**Grounding requirement (ADR-026)**: every classification MUST carry a specific citation — the tool invocation, reasoning paraphrase, or session position that exercised (or failed to exercise) the entry. Bare classifications are forbidden. The citation is recorded in the retro summary (Step 5) so the user can audit the agent's judgment.
+**Thresholds and actions**:
+| Score range | Action |
+|-------------|--------|
+| >= +3 | Promote to Critical Points candidate. The agent adds the entry to the Critical Points roll-up in `docs/briefing/README.md` during Step 3. |
+| 0 .. +2 | Keep in the topic file. No roll-up change. |
+| <= -3 | Route to the **delete queue**. These entries are surfaced for user confirmation in a single batched `AskUserQuestion` at the end of this step. |
+**Per-entry persistence format**: each briefing entry carries a trailing HTML comment block:
+```markdown
+- Entry text body goes here.
+  <!-- signal-score: 2 | last-classified: 2026-04-22 | first-written: 2026-04-15 -->
+```
+The comment block is appended to the list item (or heading) that contains the entry text. `first-written` is set when the entry is created and never changed; `last-classified` and `signal-score` are updated each retro. If an entry lacks a comment block, treat `signal-score` as `0` and set `first-written` to today.
+**Classification ownership (policy-authorised per ADR-013 Rule 5)**: the agent owns silent classification. No `AskUserQuestion` is fired for individual entry promotions, demotions, or keep decisions. The agent applies the ADR-026 heuristic directly: entry cited in a tool call (or paraphrased in reasoning) during the session = signal; loaded but neither cited nor acted on = noise; never loaded = decay-only; ambiguous cases still classify but with a tentative flag the next retro resolves.
+**Delete queue confirmation**: after scoring all entries, if any entries have a score <= -3:
+1. Present a single `AskUserQuestion` with `header: "Delete briefing entries?"` and `multiSelect: false`.
+2. The question body lists each delete candidate with its score and the ADR-026 citation that led to the noise classification.
+3. Options (up to 4 per prompt, sequential if > 4):
+   1. `Confirm all deletions` — description: "Remove all listed entries from their topic files."
+   2. `Delete selected only` — description: "The agent will present a follow-up with per-entry checkboxes."
+   3. `Keep all (defer to next retro)` — description: "Leave entries in place; scores remain unchanged."
+   4. `Review individually` — description: "Present each entry one at a time for keep/delete decision."
+4. If the queue is empty, skip the prompt entirely.
+If the user chooses `Delete selected only` or `Review individually`, present subsequent `AskUserQuestion` calls as needed, respecting the 4-option cap per ADR-013 Rule 1.
+**Tier 1 budget guard**: if promoting all score >= +3 entries would breach the 2 KB / ~10-bullet Critical Points budget (ADR-040), promote only the highest-scored entries until the budget is met and surface the remainder as a budget-overflow advisory in the retro summary.
+**Non-interactive / AFK fallback (ADR-013 Rule 6)**: when `AskUserQuestion` is unavailable, classify silently and defer the delete queue to the retro summary (Step 5). Do NOT auto-delete entries in AFK mode. The retro summary's "Signal-vs-Noise Pass" section lists each delete candidate with score and citation so the user can review on return. Same trust-boundary shape as Step 2b and Step 4a.
 ### 2. Reflect on this session
 Consider the work done in this session and identify:
@@ -64,7 +129,7 @@ The shape mirrors P068's Step 4a Verification-close housekeeping: glob / evidenc
 **Signal categories** — each detection is tagged with the primary category. A detection may match multiple categories; pick the one whose fix path is most concrete.
-1. **Hook-protocol friction** — gate-marker TTL expiries mid-work (e.g. architect-hook 1800s TTL per ADR-009 expiring while drafting a long file), marker-vs-file deadlocks (a gate demands PASS before a Write; the agent refuses to PASS on a file that doesn't exist yet), hook-exemption scope gaps, hooks firing on paths they shouldn't, hooks silently skipping paths they should.
+1. **Hook-protocol friction** — gate-marker TTL expiries mid-work (e.g. architect-hook 3600s TTL per ADR-009 expiring while drafting a very long file — was 1800s before P107), marker-vs-file deadlocks (a gate demands PASS before a Write; the agent refuses to PASS on a file that doesn't exist yet), hook-exemption scope gaps, hooks firing on paths they shouldn't, hooks silently skipping paths they should.
 2. **Skill-contract violations** — skill steps that collide (e.g. ADR-027 Step 0 colliding with ADR-031 auto-migration Step 0), skills that return empty on paths they should handle (e.g. work-problems false-zero-bail on flat-layout adopter repos), skills whose AskUserQuestion options exceed the 4-option cap (per P061), skills that silently swallow error states the contract says should halt.
 3. **Release-path instability** — `push:watch` / `release:watch` misbehaviour (P054, P060 class — reporting success on a stale SHA's workflow run), changeset authoring defects (P073), release-PR body issues, npm publish failing on metadata mismatch.
 4. **Subagent-delegation friction** — architect / jtbd / risk-scorer / style-guide / voice-tone agents returning `DEFERRED` or `ISSUES FOUND` that block progress, PASS markers failing to write, agent prompts timing out, agent outputs missing the specific citations ADR-026 requires.
@@ -117,10 +182,43 @@ For each accepted learning:
 After editing topic files, update `docs/briefing/README.md`:
 - Refresh per-file summaries in the Topic Index if the topic file's character changed.
-- Promote an entry into the Critical Points section only when it is genuinely among the highest-value rules of the session (the session-start surface is small and curated — adding there is a user-interactive decision per the helpfulness rating slice 2 will add).
+- Promote an entry into the Critical Points section when its signal-score is >= +3 (agent-driven per Step 1.5). The session-start surface is small and curated; the agent promotes the highest-scored entries first, respecting the Tier 1 budget guard. Demotion from Critical Points happens automatically when an entry's score drops below +3 after decay. The remaining user-interactive boundary is the delete queue (score <= -3), which requires explicit user confirmation.
 Use the AskUserQuestion tool to confirm any removals: "I would like to remove [item] from `docs/briefing/<topic>.md` because [reason]. Is this correct?"
+#### Tier 3 budget rotation pass (P099)
+After all topic-file edits, Step 1.5 delete-queue persistence, and the README refresh have completed, run the per-topic-file budget pass. ADR-040 Tier 3 names a 2-5 KB / topic envelope; this pass promotes that budget from informational to advisory enforcement.
+**Mechanism**: invoke `packages/retrospective/scripts/check-briefing-budgets.sh` (read-only diagnostic) against `docs/briefing/`. Each line of output identifies a topic file at or above the configured threshold:
+```
+OVER <basename> bytes=<N> threshold=<N>
+```
+The script's threshold defaults to `5120` bytes (the upper bound of ADR-040's Tier 3 envelope) and is overridable via `BRIEFING_TIER3_MAX_BYTES`. Empty stdout means no files are over budget — skip the rest of this pass.
+**Ordering**: this pass runs as the FINAL action of Step 3, after edits + Step 1.5 delete-queue persistence + README refresh. It must observe post-edit byte counts so the deletes the user confirmed in Step 1.5 are reflected in the measurement.
+**Interactive path (ADR-013 Rule 1)** — for each `OVER` line, invoke `AskUserQuestion`:
+- `header: "Rotate over-budget topic file?"`
+- `multiSelect: false`
+- The question body MUST cite the specific byte count and threshold from the script's output (per ADR-026 grounding) plus a one-line summary of what the file currently covers, so the user can pick a rotation shape without reading the full file.
+- Options (exactly four per ADR-013 Rule 1 cap):
+  1. `Split by sub-topic` — description: "Identify a coherent sub-topic in this file and migrate its entries to a new `docs/briefing/<sub-topic>.md` archive. Update README Topic Index. The agent surfaces a proposed sub-topic boundary; the user confirms in a follow-up question if needed."
+  2. `Split by date — archive oldest` — description: "Move entries older than a chosen cutoff date to a sibling archive (e.g. `docs/briefing/<topic>-archive.md`). The agent surfaces a proposed cutoff drawn from the entry HTML comment block (`first-written` field per Step 1.5)."
+  3. `Trim noise out of band` — description: "Score-and-delete in a follow-up retro is the right shape — defer rotation, leave the file as-is, and let Step 1.5's signal-vs-noise pass shrink it across cycles. Use this when no clean split boundary exists."
+  4. `Defer — record only` — description: "Surface the over-budget state in this retro's summary; take no action this session. Picks up next retro."
+The four options correspond to the four common rotation shapes for accumulator docs. The user's choice is recorded in the retro summary (Step 5) under the new Topic File Rotation section.
+**Non-interactive / AFK fallback (ADR-013 Rule 6)** — when `AskUserQuestion` is unavailable (autonomous retro, AFK orchestrator), do NOT auto-rotate. Each `OVER` line is recorded in the retro summary's "Topic File Rotation Candidates" section with the specific byte count, threshold, and one-line file summary. The user reviews on return and re-runs `/wr-retrospective:run-retro` interactively (or applies a manual split / archive) per accepted candidate. Same trust-boundary shape as Step 1.5's delete queue — surface the evidence; defer the decision.
+**Why advisory, not fail-closed**: the rotation is a judgment call (which sub-topic to extract, which archive shape to use). A CI-fail-on-overflow would block routine retros mid-session, directly violating JTBD-001 ("enforce governance without slowing down"). The advisory shape mirrors ADR-038's chosen response to the analogous honour-system byte-budget problem: bats catch script-contract drift; the script itself surfaces signal at runtime without halting.
+**Reusable pattern note** (JTBD-101): this triplet — read-only advisory script + behavioural bats fixture + ADR-tier-budget amendment — is the documented shape for any accumulator-doc surface that needs progressive-disclosure enforcement. Future surfaces (risk register per P102, ADR index, problems index) can mirror it without re-deriving.
 ### 4. Create or update problem tickets
 For each item identified in "What was harder than it should have been", "What failed", and "What should we make easier or automate", use the `/problem` skill to:
@@ -259,6 +357,20 @@ Present a summary to the user:
 - Updated: [items modified]
 - README index refreshed: [per-file summaries or Critical Points changes]
+### Signal-vs-Noise Pass (P105)
+(Always present: Step 1.5 runs on every run-retro invocation, regardless of other outcomes. In non-interactive / AFK mode, the delete queue is surfaced here instead of firing `AskUserQuestion`.)
+| Entry | Topic file | Old score | New score | Classification | Citation |
+|-------|-----------|-----------|-----------|----------------|----------|
+| <one-line entry summary> | `docs/briefing/<topic>.md` | <old> | <new> | signal / noise / decay-only | <tool call or reasoning paraphrase> |
+**Critical Points changes**: list any entries promoted to or demoted from the Critical Points roll-up.
+**Delete queue** (only when non-empty): list each score <= -3 candidate with its score and citation. In interactive mode, note the user's decision (`confirmed / deferred / kept`). In AFK mode, label each as `deferred to next interactive session`.
+**Budget overflow** (only when triggered): list any score >= +3 entries that were NOT promoted because the Tier 1 budget was already met.
 ### Problems Created/Updated
 - [problem ticket]: [summary]
@@ -278,6 +390,14 @@ Present a summary to the user:
 |--------|----------|-----------|----------|
 | <one-line signal summary> | Hook-protocol friction / Skill-contract violations / Release-path instability / Subagent-delegation friction / Repeat-work friction / Session-wrap silent drops | <specific invocations + session-position markers + observable outcomes> | new ticket via manage-problem / appended to P<NNN> / recorded in retro only / skipped (false positive) / flagged (non-interactive) |
+### Topic File Rotation Candidates
+(Emitted only when Step 3's Tier 3 budget pass surfaced topic files at or above the configured threshold via `check-briefing-budgets.sh`. Omit this section entirely when no candidates were found — or when the interactive path resolved them all during Step 3. Populated in non-interactive / AFK mode per ADR-013 Rule 6 — the user reviews on return and applies the chosen rotation shape per candidate. P099.)
+| Topic file | Bytes | Threshold | Proposed rotation | Decision |
+|------------|-------|-----------|-------------------|----------|
+| `docs/briefing/<topic>.md` | <N> | <N> | split-by-subtopic / split-by-date / trim-noise / defer | applied / deferred / flagged (non-interactive) |
 ### Codification Candidates
 | Kind | Shape | Suggested name / Target file | Scope / Flaw | Triggers / Evidence | Decision |
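
Reviewer sketch (not part of the package): the Tier 3 budget pass in this file relies on `packages/retrospective/scripts/check-briefing-budgets.sh`, which is not included in this diff. The following is a minimal sketch of a script that would satisfy the documented contract: one `OVER <basename> bytes=<N> threshold=<N>` line per file at or above the threshold, a default of `5120` bytes, an override via `BRIEFING_TIER3_MAX_BYTES`, and empty stdout when nothing is over budget. The function name `check_budgets` and the choice to scan every `*.md` file in the directory (including `README.md`) are assumptions; the real script may behave differently.

```shell
#!/usr/bin/env bash
# Illustrative only: NOT the package's check-briefing-budgets.sh.
# check_budgets is a hypothetical name chosen for this sketch.

# check_budgets DIR
# Prints one "OVER ..." line for each *.md file in DIR whose size is at or
# above the threshold. Prints nothing when every file is under budget.
check_budgets() {
  local dir=$1
  local threshold=${BRIEFING_TIER3_MAX_BYTES:-5120}
  local file bytes
  for file in "$dir"/*.md; do
    # Skip the literal pattern when the glob matches nothing.
    [ -f "$file" ] || continue
    # wc -c pads its output on some platforms, so strip whitespace.
    bytes=$(wc -c < "$file" | tr -d '[:space:]')
    if [ "$bytes" -ge "$threshold" ]; then
      echo "OVER $(basename "$file") bytes=$bytes threshold=$threshold"
    fi
  done
}

# Usage (read-only diagnostic, as the SKILL describes):
#   check_budgets docs/briefing
```

The sketch only reads file sizes and never edits anything, which matches the advisory, non-blocking shape the SKILL describes for this pass.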

package/skills/run-retro/test/run-retro-anti-pattern-clause.bats ADDED Viewed

@@ -0,0 +1,87 @@
+#!/usr/bin/env bats
+# P088: run-retro SKILL.md MUST carry a "Never invoke as a background
+# agent" anti-pattern clause that warns the agent off the
+# Agent(run_in_background: true) surface before it commits to that
+# invocation shape. The clause is the user-direction-settled outcome
+# of P088 ((b)): foreground /wr-retrospective:run-retro is the only
+# supported invocation; `claude -p` subprocess invocation (per P086)
+# remains supported because the subprocess has the iteration's context
+# naturally; background-agent invocation is deferred pending the
+# context-marshalling problem (ADR-032 capture-retro sibling, also
+# deferred).
+#
+# # Test shape: structural contract-assertion (ADR-037 fallback path)
+#
+# The architect-review verdict on P088 (2026-04-26 iter) was:
+# **structural-with-fallback-note, ship this iter**. P081 (architect-
+# design / open) flags structural grep tests on SKILL.md prose as
+# wasteful; the behavioural alternative would programmatically simulate
+# the subagent surface, invoke run-retro, and assert the skill detects
+# the surface and emits an anti-pattern denial. That requires
+# infrastructure (mock subagent stub, mock Agent-tool harness) which
+# does not exist in this repo today.
+#
+# Per ADR-037's "permitted exception" affordance for prose-only
+# contracts, this fixture takes the structural path. P081 follow-up
+# tracks the behavioural-test infrastructure build; once P081 lands a
+# subagent-surface mock, this file's structural assertions become
+# replaceable by behavioural assertions exercising the actual surface
+# detection. Until then, structural is the contract.
+#
+# # @adr ADR-037 fallback — P081 behavioural follow-up tracked.
+# # @ticket P088 — run-retro context-visibility settlement.
+setup() {
+  REPO_ROOT="$(cd "$(dirname "$BATS_TEST_FILENAME")/../../../../.." && pwd)"
+  SKILL_MD="$REPO_ROOT/packages/retrospective/skills/run-retro/SKILL.md"
+}
+@test "run-retro: SKILL.md carries 'Never invoke as a background agent' anti-pattern clause (P088)" {
+  run grep -F 'Never invoke as a background agent' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+}
+@test "run-retro: anti-pattern clause cites P088 as driver" {
+  run grep -F 'P088' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+}
+@test "run-retro: anti-pattern clause names the supported invocation surfaces" {
+  # Foreground /wr-retrospective:run-retro — the canonical invocation.
+  run grep -F 'Foreground' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+  # claude -p subprocess invocation — supported per P086 (subprocess has
+  # iteration context naturally).
+  run grep -F 'claude -p' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+  run grep -F 'P086' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+}
+@test "run-retro: anti-pattern clause names the deferred background-agent surface explicitly" {
+  # The clause MUST mention Agent(run_in_background: true) or the
+  # capture-retro sibling so a future contributor can pattern-match
+  # the surface to the warning.
+  run grep -E 'run_in_background|capture-retro' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+}
+@test "run-retro: anti-pattern clause appears in the preamble (before Step 1)" {
+  # The anti-pattern note belongs near the top of the SKILL so the
+  # agent encounters it before committing to an invocation surface.
+  # Placement requirement: clause appears before the '### 1. Read the
+  # current briefing' section header.
+  pos_clause=$(grep -n 'Never invoke as a background agent' "$SKILL_MD" | head -1 | cut -d: -f1)
+  pos_step1=$(grep -n '^### 1\. Read the current briefing' "$SKILL_MD" | head -1 | cut -d: -f1)
+  [ -n "$pos_clause" ]
+  [ -n "$pos_step1" ]
+  [ "$pos_clause" -lt "$pos_step1" ]
+}
+@test "run-retro: anti-pattern clause cross-references ADR-032 capture-retro deferral" {
+  # The clause should pin the ADR amendment so a contributor reading
+  # the SKILL can trace the deferral decision back to the ADR.
+  run grep -F 'ADR-032' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+}
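
Reviewer sketch (not part of the package): the placement test above computes line numbers with a `grep -n | head -1 | cut -d: -f1` pipeline and then compares them. If that idiom is needed in more than one fixture, it could be factored into a small helper along the following lines. The names `line_of` and `appears_before` are hypothetical, and both patterns are treated as basic regular expressions, as in the original `grep -n` calls.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for ordering assertions; not present in the package.

# line_of PATTERN FILE
# Prints the line number of the first line matching PATTERN, or nothing.
line_of() {
  grep -n -- "$1" "$2" | head -1 | cut -d: -f1
}

# appears_before FIRST SECOND FILE
# Succeeds only when both patterns match and FIRST's first match is on an
# earlier line than SECOND's first match.
appears_before() {
  local a b
  a=$(line_of "$1" "$3")
  b=$(line_of "$2" "$3")
  [ -n "$a" ] && [ -n "$b" ] && [ "$a" -lt "$b" ]
}

# Possible use inside a bats test:
#   appears_before 'Never invoke as a background agent' \
#                  '^### 1\. Read the current briefing' "$SKILL_MD"
```

A missing pattern makes the helper fail rather than pass vacuously, which mirrors the `[ -n "$pos_clause" ]` guards in the original test.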

package/skills/run-retro/test/run-retro-signal-vs-noise.bats ADDED Viewed

@@ -0,0 +1,129 @@
+#!/usr/bin/env bats
+# P105: run-retro SKILL.md documents a signal-vs-noise pass (Step 1.5) that
+# scores every briefing entry per retro cycle, drives Critical Points curation,
+# and gates deletion behind a batched user-confirmation queue.
+#
+# Doc-lint structural test (Permitted Exception per ADR-005). Asserts
+# SKILL.md wording for: the step header, placement between Step 1 and Step 2,
+# the three scoring rules (signal +2, noise -1, decay -1), the HTML comment
+# persistence format, the ADR-026 grounding requirement, the threshold
+# actions (promote/keep/delete), the Tier 1 budget guard, the delete-queue
+# AskUserQuestion contract (ADR-013 Rule 1), the AFK fallback (ADR-013 Rule 6),
+# the Step 3 agent-driven promotion update, and the Step 5 summary integration.
+setup() {
+  REPO_ROOT="$(cd "$(dirname "$BATS_TEST_FILENAME")/../../../../.." && pwd)"
+  SKILL_MD="$REPO_ROOT/packages/retrospective/skills/run-retro/SKILL.md"
+}
+@test "run-retro: SKILL.md contains Step 1.5 Briefing signal-vs-noise pass (P105)" {
+  run grep -F '### 1.5. Briefing signal-vs-noise pass (P105)' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+}
+@test "run-retro: Step 1.5 placement between Step 1 and Step 2" {
+  pos_1=$(grep -n '^### 1\. ' "$SKILL_MD" | head -1 | cut -d: -f1)
+  pos_1_5=$(grep -n '^### 1\.5\. ' "$SKILL_MD" | head -1 | cut -d: -f1)
+  pos_2=$(grep -n '^### 2\. ' "$SKILL_MD" | head -1 | cut -d: -f1)
+  [ -n "$pos_1" ]
+  [ -n "$pos_1_5" ]
+  [ -n "$pos_2" ]
+  [ "$pos_1" -lt "$pos_1_5" ]
+  [ "$pos_1_5" -lt "$pos_2" ]
+}
+@test "run-retro: Step 1.5 documents all three scoring events" {
+  run grep -F 'Signal | +2' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+  run grep -F 'Noise | -1' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+  run grep -F 'Decay | -1' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+}
+@test "run-retro: Step 1.5 documents the HTML comment persistence format" {
+  run grep -F 'signal-score:' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+  run grep -F 'last-classified:' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+  run grep -F 'first-written:' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+}
+@test "run-retro: Step 1.5 requires ADR-026 grounding for classifications" {
+  run grep -F 'ADR-026' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+  run grep -F 'specific citation' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+  run grep -F 'Bare classifications are forbidden' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+}
+@test "run-retro: Step 1.5 documents promote threshold (score >= +3)" {
+  run grep -F '>= +3' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+  run grep -F 'Promote to Critical Points' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+}
+@test "run-retro: Step 1.5 documents delete queue threshold (score <= -3)" {
+  run grep -F '<= -3' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+  run grep -F 'delete queue' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+}
+@test "run-retro: Step 1.5 delete queue uses single batched AskUserQuestion (ADR-013 Rule 1)" {
+  run grep -F 'Delete briefing entries?' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+  run grep -F 'Confirm all deletions' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+  run grep -F 'Keep all (defer to next retro)' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+}
+@test "run-retro: Step 1.5 AFK fallback defers delete queue to retro summary (ADR-013 Rule 6)" {
+  run grep -F 'Non-interactive / AFK fallback (ADR-013 Rule 6)' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+  run grep -F 'Do NOT auto-delete' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+  run grep -F 'Signal-vs-Noise Pass' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+}
+@test "run-retro: Step 1.5 documents the Tier 1 budget guard" {
+  run grep -F 'Tier 1 budget guard' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+  run grep -F '2 KB' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+}
+@test "run-retro: Step 1.5 cites ADR-040 for the Critical Points budget" {
+  run grep -F 'ADR-040' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+}
+@test "run-retro: Step 1.5 classification is policy-authorised silent (ADR-013 Rule 5)" {
+  run grep -F 'policy-authorised per ADR-013 Rule 5' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+  run grep -F 'agent owns silent classification' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+}
+@test "run-retro: Step 3 acknowledges agent-driven promotion from Step 1.5 score" {
+  run grep -F 'agent-driven per Step 1.5' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+  run grep -F 'signal-score' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+}
+@test "run-retro: Step 5 summary adds a Signal-vs-Noise Pass section" {
+  run grep -F '### Signal-vs-Noise Pass (P105)' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+}
+@test "run-retro: Signal-vs-Noise Pass summary table columns match Step 1.5 output" {
+  run grep -F '| Entry | Topic file | Old score | New score | Classification | Citation |' "$SKILL_MD"
+  [ "$status" -eq 0 ]
+}
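
Reviewer sketch (not part of the package): the assertions above pin the wording of Step 1.5 but not its arithmetic. As a rough illustration of how the documented rules combine, the following sketch reads a score from the HTML comment block, applies the per-cycle deltas, and maps the result to an action. All function names are hypothetical. Two readings are assumptions rather than facts from the SKILL: that the decay delta stacks on top of the signal or noise delta (the table says decay applies "regardless of signal/noise status"), and that scores of -1 and -2, which the threshold table does not cover, are treated as "keep".

```shell
#!/usr/bin/env bash
# Hypothetical helpers illustrating Step 1.5; none of these names exist in
# the package.

# read_score LINE
# Extracts signal-score from an entry's HTML comment; defaults to 0 when the
# comment block is missing, as the SKILL specifies.
read_score() {
  local score
  score=$(printf '%s\n' "$1" | sed -n 's/.*signal-score: *\(-\{0,1\}[0-9]\{1,\}\).*/\1/p')
  echo "${score:-0}"
}

# apply_deltas OLD_SCORE CLASSIFICATION
# signal adds +2, noise adds -1, decay-only adds nothing; then decay (-1) is
# applied to every entry. Stacking the two deltas is an assumption.
apply_deltas() {
  local score=$1
  case "$2" in
    signal)     score=$((score + 2)) ;;
    noise)      score=$((score - 1)) ;;
    decay-only) ;;
    *) echo "unknown classification: $2" >&2; return 1 ;;
  esac
  echo $((score - 1))
}

# action_for_score SCORE
# Thresholds from the SKILL: >= +3 promote, <= -3 delete queue. Everything in
# between is treated as keep (the -2..-1 range is an assumption).
action_for_score() {
  if [ "$1" -ge 3 ]; then
    echo promote
  elif [ "$1" -le -3 ]; then
    echo delete-queue
  else
    echo keep
  fi
}

line='  <!-- signal-score: 2 | last-classified: 2026-04-22 | first-written: 2026-04-15 -->'
new=$(apply_deltas "$(read_score "$line")" signal)
echo "$new $(action_for_score "$new")"   # prints "3 promote"
```

Under these assumptions a signal entry nets +1 per cycle and a noise entry nets -2, so an entry starting at 0 would reach the delete queue after two consecutive noise classifications.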