@codyswann/lisa 2.9.1 → 2.10.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (35)
  1. package/package.json +1 -1
  2. package/plugins/lisa/.claude-plugin/plugin.json +1 -1
  3. package/plugins/lisa/agents/learnings-synthesizer.md +135 -0
  4. package/plugins/lisa/agents/pr-mining-specialist.md +85 -0
  5. package/plugins/lisa/agents/tracker-mining-specialist.md +85 -0
  6. package/plugins/lisa/commands/debrief/apply.md +6 -0
  7. package/plugins/lisa/commands/debrief.md +6 -0
  8. package/plugins/lisa/hooks/enforce-team-first.sh +9 -3
  9. package/plugins/lisa/rules/intent-routing.md +97 -17
  10. package/plugins/lisa/skills/confluence-to-tracker/SKILL.md +14 -0
  11. package/plugins/lisa/skills/debrief/SKILL.md +79 -0
  12. package/plugins/lisa/skills/debrief-apply/SKILL.md +63 -0
  13. package/plugins/lisa/skills/github-to-tracker/SKILL.md +14 -0
  14. package/plugins/lisa/skills/linear-to-tracker/SKILL.md +14 -0
  15. package/plugins/lisa/skills/notion-to-tracker/SKILL.md +14 -0
  16. package/plugins/lisa/skills/prd-backlink/SKILL.md +89 -0
  17. package/plugins/lisa-cdk/.claude-plugin/plugin.json +1 -1
  18. package/plugins/lisa-expo/.claude-plugin/plugin.json +1 -1
  19. package/plugins/lisa-nestjs/.claude-plugin/plugin.json +1 -1
  20. package/plugins/lisa-rails/.claude-plugin/plugin.json +1 -1
  21. package/plugins/lisa-typescript/.claude-plugin/plugin.json +1 -1
  22. package/plugins/src/base/agents/learnings-synthesizer.md +135 -0
  23. package/plugins/src/base/agents/pr-mining-specialist.md +85 -0
  24. package/plugins/src/base/agents/tracker-mining-specialist.md +85 -0
  25. package/plugins/src/base/commands/debrief/apply.md +6 -0
  26. package/plugins/src/base/commands/debrief.md +6 -0
  27. package/plugins/src/base/hooks/enforce-team-first.sh +9 -3
  28. package/plugins/src/base/rules/intent-routing.md +97 -17
  29. package/plugins/src/base/skills/confluence-to-tracker/SKILL.md +14 -0
  30. package/plugins/src/base/skills/debrief/SKILL.md +79 -0
  31. package/plugins/src/base/skills/debrief-apply/SKILL.md +63 -0
  32. package/plugins/src/base/skills/github-to-tracker/SKILL.md +14 -0
  33. package/plugins/src/base/skills/linear-to-tracker/SKILL.md +14 -0
  34. package/plugins/src/base/skills/notion-to-tracker/SKILL.md +14 -0
  35. package/plugins/src/base/skills/prd-backlink/SKILL.md +89 -0
@@ -0,0 +1,85 @@
+ ---
+ name: tracker-mining-specialist
+ description: "Tracker mining specialist for the Debrief flow. Walks every work item in a shipped initiative — description, comments, status transitions, child sub-tasks added during implementation, and bugs filed afterward referencing the item — and produces a structured findings list. Pairs with pr-mining-specialist (parallel) and feeds learnings-synthesizer."
+ skills:
+ - jira-read-ticket
+ - github-read-issue
+ - tracker-read
+ ---
+
+ # Tracker Mining Specialist Agent
+
+ You are a tracker mining specialist. Your job is to walk a closed initiative's tickets exhaustively and surface every signal that could become a learning, from the tracker side only. PR mining is owned by `pr-mining-specialist` running in parallel — do not duplicate that work.
+
+ ## Scope
+
+ You answer one question per work item: **What did the tracker record about this work that wasn't in the original spec?**
+
+ Adjacent questions other agents own:
+
+ | Question | Owner |
+ |----------|-------|
+ | What did PR review threads, late commits, and added tests reveal? | `pr-mining-specialist` |
+ | Across all tracker + PR findings, what is a candidate learning vs. noise? | `learnings-synthesizer` |
+ | Does the shipped work match the spec? | `spec-conformance-specialist` |
+
+ You are exhaustive, not selective. Surface the candidate; let the synthesizer judge.
+
+ ## Inputs
+
+ The team lead provides a list of `(work_item_key_or_id, tracker_type)` tuples. For each one, you walk the full ticket graph:
+
+ - The ticket itself: description, all fields, current status
+ - Every comment in chronological order, including agent-posted evidence comments and CodeRabbit summaries that landed on the ticket
+ - Status transitions and the duration spent in each status (long stalls are signals)
+ - Child sub-tasks — especially ones added *after* the original Plan run (those represent scope discovered during implementation)
+ - Issue links — `blocks`, `is blocked by`, `relates to`, `duplicates`, `clones` — and any new bug tickets filed *after* this one closed that reference it (regression signals)
+
+ Use the matching read skills (`jira-read-ticket` / `github-read-issue`) via `tracker-read`. Do not call MCP write tools.
+
+ ## Mining checklist (per work item)
+
+ Walk every item against this list. A finding is not "interesting" or "boring" — that judgment is the synthesizer's. You log every signal that matches a checklist row.
+
+ 1. **Description vs. final state divergence** — did the description list acceptance criteria that the comments reveal were silently changed? Note the original AC and what actually shipped.
+ 2. **Comments hinting at edge cases discovered during implementation** — phrases like "found that", "turns out", "edge case where", "we'll also need to handle", "broke when". Capture the comment author, timestamp, and quoted text.
+ 3. **Engineering decisions made in comments rather than the description** — these are convention drift candidates; the next agent reading a similar ticket has no way to find this decision.
+ 4. **Status stalls** — any status where the item sat longer than the median for its type (use a simple heuristic: > 3x the median duration of other items in this initiative for the same status). Long stalls usually indicate friction or an external dependency.
+ 5. **Sub-tasks added after the parent's Plan run** — every late-added sub-task is a scope-creep or missed-edge-case signal. Capture the sub-task summary and the parent's original AC.
+ 6. **Reopen / re-close cycles** — items that were closed and reopened indicate the original "done" was wrong. Capture each transition.
+ 7. **Bugs filed referencing this item after close** — search for issues that link back to this key with `relates to` / `duplicates` / `caused by`, or that cite it in their description. Each one is a candidate edge case the original spec missed.
+ 8. **CodeRabbit / bot summary content posted to the ticket** — bots often summarize PR review themes in a single comment. Pull those out verbatim.
+ 9. **Manual product / QA notes** — any comment that reports a manual test outcome ("tested in dev — works for case A, broke for case B") is gold; capture both cases.
+ 10. **Empty or thin acceptance criteria** that nonetheless shipped — itself a learning (process gap or rubber-stamping).
+
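The stall heuristic in checklist row 4 can be made concrete. A minimal sketch, assuming status durations have already been extracted from the tracker's transition history; the `find_stalls` name and the hours-based input shape are illustrative, not part of the plugin:

```python
from statistics import median

def find_stalls(durations_by_item, threshold=3.0):
    """Flag (item, status) pairs whose time-in-status exceeds
    `threshold` x the median for that status across the initiative.

    durations_by_item: {item_key: {status: hours_in_status}}
    Returns (item_key, status, hours, median_hours) stall signals.
    """
    # Pool per-status durations across every item in the initiative,
    # so each item is compared against its peers.
    by_status = {}
    for statuses in durations_by_item.values():
        for status, hours in statuses.items():
            by_status.setdefault(status, []).append(hours)

    stalls = []
    for item, statuses in durations_by_item.items():
        for status, hours in statuses.items():
            med = median(by_status[status])
            if med > 0 and hours > threshold * med:
                stalls.append((item, status, hours, med))
    return stalls
```

An item whose "In Progress" time is more than three times the initiative's median for that status would be logged as a row-4 finding.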
+ ## Output
+
+ Produce a single structured markdown report per work item, then aggregate across all items into a final report at the path the team lead provides. Per-item structure:
+
+ ```markdown
+ ## <work_item_key>: <summary>
+
+ - Status path: <status1> (<duration>) → <status2> (<duration>) → ...
+ - Linked PRs: <list>
+ - Sub-tasks added post-Plan: <list with original-vs-late timestamps>
+ - Reopen cycles: <count, with dates>
+ - Bugs filed afterward referencing this: <list of keys>
+
+ ### Findings
+
+ 1. <category from checklist row>: <one-line summary>
+    Evidence: <link to comment / transition / sub-task>
+    Quote (if applicable): "<verbatim>"
+ 2. ...
+ ```
+
+ If there are no findings under a checklist row, write `(none)` — silence is itself information for the synthesizer.
+
+ ## Rules
+
+ - **Never judge.** "Probably not interesting" is not a category. Every signal that matches a checklist row goes in.
+ - **Quote verbatim.** Paraphrasing comments loses author voice and the specifics that make a finding actionable.
+ - **Link, don't summarize.** Every finding has at least one evidence link to the source artifact (comment URL, ticket URL fragment, PR URL).
+ - **Run within the team.** Do not call `TeamCreate`. The Debrief skill created the team; you are a teammate.
+ - **Read-only.** Never call write MCP tools. You report; you do not mutate.
+ - **Parallel-safe.** You run alongside `pr-mining-specialist`; do not coordinate with them. The synthesizer reconciles.
@@ -0,0 +1,6 @@
+ ---
+ description: "Apply human-marked dispositions from a Debrief triage document — route accepted learnings to their persistence destinations (Edge Case Brainstorm checklist, project rules, memory, tracker tickets). Reads the triage doc produced by /lisa:debrief; deterministic and idempotent."
+ argument-hint: "<path to triage doc | URL>"
+ ---
+
+ Use the /lisa:debrief:apply command (which invokes the `lisa:debrief-apply` skill) to read the triage document at $ARGUMENTS, parse human dispositions, and persist accepted learnings to their categorized destinations.
@@ -0,0 +1,6 @@
+ ---
+ description: "Debrief a shipped initiative — mine tickets, PRs, and review threads to surface candidate learnings (edge cases, gotchas, friction, tooling gaps, convention drift) for human triage. Stops after producing the triage doc; persistence happens in /lisa:debrief:apply."
+ argument-hint: "<PRD URL | epic key | epic URL>"
+ ---
+
+ Use the /lisa:debrief skill to walk the original Plan, mine completed work units and their PRs, and produce a triage-ready learnings document for $ARGUMENTS.
@@ -2,7 +2,7 @@
  # Enforces team-first orchestration for lifecycle skills.
  #
  # Triggered on four hook events:
- # - UserPromptSubmit : detects /lisa:research|plan|implement|intake in the
+ # - UserPromptSubmit : detects /lisa:research|plan|implement|intake|debrief in the
  # raw prompt and arms enforcement for the session
  # - PreToolUse : detects the same skills via a `Skill` tool call,
  # arms enforcement, and blocks bypass tool calls
@@ -46,7 +46,7 @@ find "$STATE_DIR" -maxdepth 1 -type f -mmin +1440 -delete 2>/dev/null || true
 
  is_lifecycle_skill() {
  case "$1" in
- lisa:research|lisa:plan|lisa:implement|lisa:intake) return 0 ;;
+ lisa:research|lisa:plan|lisa:implement|lisa:intake|lisa:debrief) return 0 ;;
  *) return 1 ;;
  esac
  }
@@ -63,7 +63,13 @@ case "$HOOK_EVENT" in
  # Match a slash command at the start of the prompt (allow optional whitespace).
  LEADING=$(printf '%s' "$PROMPT" | sed -n '1p' | sed -E 's/^[[:space:]]*//')
  case "$LEADING" in
- /lisa:research*|/lisa:plan*|/lisa:implement*|/lisa:intake*)
+ # /lisa:debrief:apply is single-agent — it is excluded by listing it
+ # first with a no-op action; case matching is first-match-wins, so the
+ # broader /lisa:debrief* pattern below never sees it.
+ /lisa:debrief:apply*)
+ : # single-agent, no team enforcement
+ ;;
+ /lisa:research*|/lisa:plan*|/lisa:implement*|/lisa:intake*|/lisa:debrief*)
  # Strip leading slash and any args after the first whitespace.
  SKILL_NAME=$(printf '%s' "$LEADING" | sed -E 's|^/||; s/[[:space:]].*$//')
  printf '%s\n' "$SKILL_NAME" >"$SKILL_FLAG" 2>/dev/null || true
@@ -11,7 +11,7 @@ This protocol runs **once per session**, on the first user message. After that,
  1. If the user invoked a slash command (`/lisa:research`, `/lisa:plan`, `/lisa:implement`, `/lisa:verify`, `/lisa:monitor`, `/lisa:intake`, etc.), the flow is already determined -- skip classification.
  2. Read the user's request and match it against the flow definitions below.
  3. If you cannot confidently classify the request:
- - **Interactive session** (user is present): present a multiple choice using AskUserQuestion with options: Research, Plan, Implement, Verify, No flow.
+ - **Interactive session** (user is present): present a multiple choice using AskUserQuestion with options: Research, Plan, Implement, Verify, Debrief, No flow.
  - **Headless/non-interactive session** (running with `-p` flag, in a CI pipeline, or as a scheduled agent): do NOT ask the user. Classify to the best of your ability from available context (ticket content, prompt text, current branch state). If you truly cannot classify, default to "No flow" and proceed with the request as-is.
  4. Once a flow is selected, **echo it back explicitly** before doing anything else. State the flow, the work type (if applicable), and a one-sentence justification for why this flow was chosen. Example:
 
@@ -34,7 +34,7 @@ What this rule still enforces:
 
  2. **Cascade rule (load-bearing)**: Before calling `TeamCreate`, check whether you are already operating inside an agent team. Signs you are inside a team: a prior `TeamCreate` exists in this session; you were spawned via `Agent` with `team_name`; your context references a team lead. If any of these are true, **do NOT call `TeamCreate`** — the harness rejects double-creates and the work stalls. Continue within the existing team. Invoke flows via the Skill tool; the team lead inherits responsibility for orchestration.
 
- 3. **Default mode**: `Research`, `Plan`, `Implement`, and `Intake` run as agent teams. The `Implement` flow — including every work type (`Build`, `Fix`, `Improve`, `Investigate-Only`) — is **always** a team flow. Bug fixes that "look simple" are not an exception: the Reproduce sub-flow, debug-specialist, bug-fixer, parallel reviewers, and verification-specialist all need to compose. `Verify` (standalone) and `Monitor` (standalone) use the One-shot Sub-agents pattern (see `## Orchestration` below) — these flows are linear with no parallelism and the team overhead is not warranted. Single-agent mode is otherwise reserved for: `product-walkthrough` invoked standalone (not as part of Research/Plan), and one-off diagnostic Bash/Read sessions that don't invoke any lifecycle skill. When in doubt, use a team.
+ 3. **Default mode**: `Research`, `Plan`, `Implement`, `Intake`, and `Debrief` run as agent teams. The `Implement` flow — including every work type (`Build`, `Fix`, `Improve`, `Investigate-Only`) — is **always** a team flow. Bug fixes that "look simple" are not an exception: the Reproduce sub-flow, debug-specialist, bug-fixer, parallel reviewers, and verification-specialist all need to compose. `Debrief` runs as a team because tracker-mining and pr-mining parallelize cleanly and synthesis gates on both completing. `Verify` (standalone) and `Monitor` (standalone) use the One-shot Sub-agents pattern (see `## Orchestration` below) — these flows are linear with no parallelism and the team overhead is not warranted. Single-agent mode is otherwise reserved for: `product-walkthrough` invoked standalone (not as part of Research/Plan), `debrief-apply` (deterministic routing of human-marked dispositions), and one-off diagnostic Bash/Read sessions that don't invoke any lifecycle skill. When in doubt, use a team.
 
  The mechanical TeamCreate bootstrap directive lives inside each lifecycle skill — see those skills' orchestration preambles for the exact wording: first `ToolSearch{select:TeamCreate}` (load deferred schema), then `TeamCreate`.
 
@@ -65,10 +65,11 @@ Gate:
  Sequence:
  1. **Investigate sub-flow** -- gather context from codebase, git history, existing behavior, and external sources
  2. `product-specialist` -- define user goals, user flows (Gherkin), acceptance criteria, error states, UX concerns, and out-of-scope items
- 3. `architecture-specialist` -- assess technical feasibility, identify constraints, map existing system boundaries
- 4. Synthesize findings into a PRD document containing: problem statement, user stories, acceptance criteria, technical constraints, open questions, and proposed scope
- 5. **Plan Phase Tooling** -- review all available skills and agents (project-defined, plugin-provided, and built-in) and determine which ones the Plan phase will need. For each recommended skill or agent, state why it is needed. If no skills or agents beyond the defaults are identified, explicitly justify why the standard set is sufficient. Include this as a "Recommended Tooling for Plan Phase" section in the PRD.
- 6. `learner` -- capture discoveries for future sessions
+ 3. **Edge Case Brainstorm sub-flow** -- run the PRD candidate through the edge-case checklist; fold accepted cases into acceptance criteria, out-of-scope, or open questions
+ 4. `architecture-specialist` -- assess technical feasibility, identify constraints, map existing system boundaries
+ 5. Synthesize findings into a PRD document containing: problem statement, user stories, acceptance criteria, technical constraints, open questions, and proposed scope
+ 6. **Plan Phase Tooling** -- review all available skills and agents (project-defined, plugin-provided, and built-in) and determine which ones the Plan phase will need. For each recommended skill or agent, state why it is needed. If no skills or agents beyond the defaults are identified, explicitly justify why the standard set is sufficient. Include this as a "Recommended Tooling for Plan Phase" section in the PRD.
+ 7. `learner` -- capture discoveries for future sessions
 
  Output: A PRD document that includes a "Recommended Tooling for Plan Phase" section listing the skills and agents the Plan phase should use. If there is not enough context to produce a complete PRD, stop and report what is missing rather than producing an incomplete one.
 
@@ -84,19 +85,21 @@ Gate:
 
  Sequence:
  1. **Investigate sub-flow** -- explore codebase for architecture, patterns, dependencies relevant to the spec
- 2. `product-specialist` -- validate and refine acceptance criteria for the whole scope
- 3. `architecture-specialist` -- map dependencies, identify cross-cutting concerns, determine execution order
- 4. **Implement/Verify Phase Tooling** -- review all available skills and agents (project-defined, plugin-provided, and built-in) and determine which ones the Implement and Verify phases will need for each work item. For each recommended skill or agent, state why it is needed and which work items it applies to. If no skills or agents beyond the defaults are identified for a work item, explicitly justify why the standard set is sufficient.
- 5. Decompose into ordered work items (epics, stories, tasks, spikes, bugs), each with:
+ 2. `product-specialist` -- validate and refine acceptance criteria for the whole scope, including error states and UX concerns
+ 3. **Edge Case Brainstorm sub-flow** -- run the PRD as a whole through the checklist to catch scope-shaped gaps before decomposition
+ 4. `architecture-specialist` -- map dependencies, identify cross-cutting concerns, determine execution order
+ 5. **Implement/Verify Phase Tooling** -- review all available skills and agents (project-defined, plugin-provided, and built-in) and determine which ones the Implement and Verify phases will need for each work item. For each recommended skill or agent, state why it is needed and which work items it applies to. If no skills or agents beyond the defaults are identified for a work item, explicitly justify why the standard set is sufficient.
+ 6. Decompose into ordered work items (epics, stories, tasks, spikes, bugs). For each item, run the **Edge Case Brainstorm sub-flow** scoped to that item — accepted cases become additional acceptance criteria or sub-tasks; rejected ones are noted with a one-line reason. Each item carries:
  - Type (epic, story, task, spike, bug)
- - Acceptance criteria
+ - Acceptance criteria (including any added by the per-item brainstorm)
  - Verification method
  - Dependencies
- - Skills and agents required (from step 4)
- 6. Create work items in the tracker (JIRA, Linear, GitHub) with acceptance criteria, dependencies, and recommended skills/agents
- 7. `learner` -- capture discoveries for future sessions
+ - Skills and agents required (from step 5)
+ 7. Create work items in the tracker (JIRA, Linear, GitHub) with acceptance criteria, dependencies, and recommended skills/agents
+ 8. **PRD back-link** -- update the source PRD with a `## Tickets` section listing every created work item (key, title, type, link), so the PRD becomes the canonical anchor for downstream flows (notably **Debrief**). Invoke `lisa:prd-backlink` with the PRD source and the created ticket list. The section is regenerated on each run, not appended, so re-planning never produces stale links.
+ 9. `learner` -- capture discoveries for future sessions
 
- Output: Work items in a tracker with acceptance criteria and recommended skills/agents, ordered by dependency. If the specification cannot be decomposed without further clarification, stop and report what is missing.
+ Output: Work items in a tracker with acceptance criteria and recommended skills/agents, ordered by dependency. The source PRD carries a `## Tickets` section linking back to every created item. If the specification cannot be decomposed without further clarification, stop and report what is missing.
 
  ### Implement
 
@@ -189,6 +192,32 @@ Sequence:
 
  Output: Merged PR, successful deploy, remote verification passing.
 
+ ### Debrief
+
+ When: An initiative is fully shipped — every work item from the original Plan is in a terminal state and its PR is merged. The user wants to surface candidate learnings (edge cases, gotchas, friction, tooling gaps, convention drift) for human triage so future agents inherit what this initiative taught.
+
+ Gate:
+ - A PRD or epic must be provided as input — the PRD URL (Notion / Confluence / Linear / GitHub Issue / file), the epic key (JIRA), or the epic issue URL (GitHub). The PRD's `## Tickets` section (written by Plan step 8) is the canonical anchor for the work-item set; an epic's children are the equivalent.
+ - Every work item linked from the input must be in a terminal state (Done / Closed / Cancelled). If any item is still open, stop and list the unfinished items.
+ - Every Done item that was implementable must have at least one merged PR linked. If a Done item has no PR, surface it as a debrief anomaly rather than silently excluding it.
+ - Headless / non-interactive sessions: do not block on missing input — if the input is ambiguous (e.g., only a vague initiative name), fail with a clear error listing what was needed.
+
+ Sequence:
+ 1. **Resolve the work-item set** — read the input. If it's a PRD, follow its `## Tickets` section. If it's an epic, list its children. Build the canonical list of `(work_item, linked_PRs[])` tuples. If a work item has no `linked_PRs` and is not a spike, mark it as an anomaly to surface in step 4.
+ 2. **Mine in parallel** (run as concurrent tasks within the team):
+    - `tracker-mining-specialist` — for every work item, walk the description, every comment (human, agent evidence, CodeRabbit summary), status transitions and their durations, late-arriving bugs that reference the item, and child sub-tasks added during implementation. Output a structured per-ticket findings list.
+    - `pr-mining-specialist` — for every linked PR, walk the description, every review comment (general + inline; CodeRabbit + human), every commit on the branch (especially late `fix:` / `revert:` / follow-up commits), and every test file added. Output a structured per-PR findings list.
+ 3. `learnings-synthesizer` — consume both findings lists, deduplicate, and categorize each candidate learning into one of:
+    - **Edge case** — a failure mode that should have been caught at PRD/Plan time; candidate addition to the Edge Case Brainstorm checklist
+    - **Recurring gotcha** — a stack- or codebase-specific trap (e.g., "this ORM silently truncates X")
+    - **Process friction** — a step in the lifecycle that consistently slowed the work
+    - **Tooling gap** — missing skill, wrong agent assignment, broken hook, missing automation
+    - **Convention drift** — an unwritten rule revealed by review comments that should be codified
+ 4. **Produce the human-triage document** — a markdown file with one row per candidate learning showing: category, summary, evidence (links to the source ticket comment / PR comment / commit), recommended persistence destination, and a checkbox-style disposition field the human will mark (Accept / Reject / Defer). Surface step-1 anomalies (work items missing PRs, etc.) in a separate section. The document is exhaustive — it lists every candidate, even ones the synthesizer rates low confidence — because the human, not the agent, decides what is worth keeping.
+ 5. **Stop and hand the document to the human.** Debrief does NOT persist accepted learnings itself. The human triages, marks dispositions, and runs the **`/lisa:debrief:apply`** command (skill: `debrief-apply`) to route the accepted items to their destinations.
+
+ Output: A triage-ready learnings document covering every work item and PR in the initiative, with structured evidence and disposition fields. Persistence is deferred to `debrief-apply`, which the human invokes after triage.
+
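The Gate above reduces to a small amount of set logic. A minimal sketch, assuming the work items have already been fetched into plain dicts; the `check_debrief_gate` helper and its input shape are illustrative, not part of the plugin:

```python
TERMINAL = {"Done", "Closed", "Cancelled"}

def check_debrief_gate(items):
    """items: list of dicts like
    {"key": "A-1", "status": "Done", "type": "story", "linked_prs": ["#12"]}

    Returns (open_items, anomalies):
    - open_items: keys still outside a terminal state (Debrief must stop
      and list these)
    - anomalies: terminal, non-spike items with no merged PR linked
      (surfaced in the triage doc, never silently excluded)
    """
    open_items = [i["key"] for i in items if i["status"] not in TERMINAL]
    anomalies = [
        i["key"]
        for i in items
        if i["status"] in TERMINAL
        and i["type"] != "spike"
        and not i.get("linked_prs")
    ]
    return open_items, anomalies
```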
 
  ## Sub-flows
 
  Sub-flows are reusable sequences invoked by main flows. When a flow says "Investigate sub-flow", execute the full Investigate sequence.
@@ -203,6 +232,54 @@ Sequence:
  3. `ops-specialist` -- check logs, errors, health (if runtime issue)
  4. Report findings with evidence
 
+ ### Edge Case Brainstorm
+
+ Purpose: Force explicit consideration of edge cases at PRD time and at work-item time, so failure modes that change scope or add acceptance criteria are caught before implementation rather than after a bug is filed in production.
+
+ Invoked by: Research (against the PRD as a whole), Plan (once against the PRD before decomposition, then once per work item during decomposition), and Build / Fix sub-flows when a `product-specialist` or `test-specialist` step would otherwise rubber-stamp acceptance criteria.
+
+ Sequence:
+ 1. Walk through the checklist below and propose every candidate edge case that plausibly applies to the scope under review. Aim for breadth, not pre-filtered relevance — propose first, judge second.
+ 2. For each candidate, take an explicit action and record it:
+    - **Accept** — fold into acceptance criteria (PRD-level or work-item level), or open a new work item / sub-task if the case is large enough to warrant one
+    - **Defer** — capture as an open question or `Out of Scope` line with a one-sentence reason
+    - **Reject** — note the case and a one-sentence reason it does not apply (e.g., "single-tenant, no concurrent edits possible")
+ 3. A silent skip is not allowed — every candidate from the checklist must end up Accepted, Deferred, or Rejected with a reason. "Considered edge cases" without a per-item disposition does not satisfy this sub-flow.
+ 4. If three or more candidates are Accepted at PRD time, treat that as a signal that the PRD scope is wider than originally framed and call it out in the synthesis step.
+
+ Checklist (pattern + question form — ask each question literally of the scope under review):
+
+ **Navigation & URL state**
+ - *Reload persistence*: if the user reloads mid-task, do they land where they were — same tab, same filters, same scroll, same selection — or get bounced to a default?
+ - *Deep linking*: can the URL alone reconstruct the screen, or does it require state from a previous click?
+ - *Back / forward*: does browser history match what the user expects, or does it skip steps or re-trigger side effects?
+ - *Parameter change then reload*: after the user changes filters / sort / tab / pagination, does a reload preserve those choices?
+
+ **Data lifecycle**
+ - *Empty state*: what does this look like the very first time, with zero data?
+ - *Single vs. many*: does the UI degrade with 1 item, 10k items, or at pagination boundaries?
+ - *Stale data*: if the user leaves the tab open for an hour, what is wrong when they come back?
+ - *Concurrent edits*: two users (or two tabs) editing the same record — last-write-wins, conflict, or merge?
+ - *Deletion mid-flow*: the resource the user is viewing gets deleted by someone else while they have it open.
+
+ **Failure modes**
+ - *Network*: offline, slow, intermittent, request mid-flight when the user navigates away.
+ - *Partial success*: bulk action where 8 of 10 succeed — what does the user see and what state is the system in?
+ - *Permission denied mid-flow*: token expires, role changes, resource becomes inaccessible.
+ - *Idempotency*: double-click submit, retry after timeout — does the action happen twice?
+
+ **Input boundaries**
+ - *Text*: empty, max-length, unicode, whitespace-only, leading / trailing whitespace, emoji, RTL.
+ - *Numeric*: zero, negative, very large, non-integer, floating-point precision.
+ - *Date / time*: timezone, DST transition, leap day, "now" vs. server time skew.
+
+ **Auth & session**
+ - *Session expiry mid-action*: what happens to in-flight work?
+ - *Role downgrade*: the user loses access to the screen they are currently on.
+ - *Multi-tab session*: logout in one tab while another tab is mid-action.
+
+ This list is non-exhaustive — agents should propose additional edge cases relevant to the domain (e.g., real-time / streaming, money / financial rounding, regulated data, multi-tenant isolation) and run them through the same Accept / Defer / Reject discipline.
+
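The no-silent-skip rule in step 3 is mechanically checkable. A hedged sketch of such a validator; the `undispositioned` helper and the candidate dict shape are hypothetical, not part of the plugin:

```python
ALLOWED = {"Accept", "Defer", "Reject"}

def undispositioned(candidates):
    """candidates: list of {"case": str, "disposition": str|None, "reason": str}

    Returns the cases that violate the sub-flow: no disposition at all,
    an unknown disposition, or a Defer/Reject without the required
    one-sentence reason. (Accept needs no reason; it lands in the AC.)
    """
    bad = []
    for c in candidates:
        d = c.get("disposition")
        if d not in ALLOWED:
            bad.append(c["case"])
        elif d in {"Defer", "Reject"} and not c.get("reason"):
            bad.append(c["case"])
    return bad
```

An empty return value is what "every candidate ends up Accepted, Deferred, or Rejected with a reason" looks like in data.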
  ### Reproduce
 
  Purpose: Create a reliable reproduction that demonstrates a bug before fixing it.
@@ -267,11 +344,13 @@ Vendor-neutral callers (e.g., `implement`, `verify`) should invoke the `tracker-
 
  Flows can chain naturally:
  - Research produces a PRD -- hand it to Plan
- - Plan produces work items -- hand each to Implement
+ - Plan produces work items (and writes a `## Tickets` back-link section into the PRD) -- hand each item to Implement
  - Implement produces verified code -- hand to Verify
+ - Verify ships and confirms the deploy -- once every work item in the PRD is shipped, hand the PRD (or the epic) to Debrief
+ - Debrief produces a triage-ready learnings document -- hand to the human, who marks dispositions and runs `debrief-apply` to persist accepted learnings
  - If any flow discovers it lacks what it needs, it stops and suggests the preceding flow
 
- The full lifecycle for a large initiative: Research -> Plan -> Implement (per item) -> Verify (per item).
+ The full lifecycle for a large initiative: Research -> Plan -> Implement (per item) -> Verify (per item) -> Debrief (once across the whole initiative) -> Debrief Apply (human-triggered, after triage).
 
  ## Sub-flow Usage
 
@@ -290,6 +369,7 @@ Use an **agent team** (TeamCreate + TaskCreate per step) for:
  - **Implement** (Build, Fix, Improve) — long sequences with parallel review and a real risk of compaction
  - **Plan** — multiple specialists feeding a shared decomposition
  - **Research** — multiple specialists feeding a shared PRD
+ - **Debrief** — tracker-mining and pr-mining run in parallel and gate the synthesizer; the work-item set can be large, so durable task state matters
  - Any flow that invokes the **Review sub-flow** (the four review specialists run in parallel and gate a single follow-up task)
 
  Why: these flows have enough steps that context compaction is likely; the Review sub-flow is parallel-by-design and `blockedBy` expresses that cleanly; durable task state lets the team lead recover assignments after compaction.
@@ -259,6 +259,20 @@ After all tickets are created, present a summary table to the user:
  - Blockers list with recommendations and alternatives
  - Cross-PRD dependencies

+ ### Phase 7: PRD Back-link
+
+ > **Mode guard**: In `dry_run: true` mode, skip this phase entirely — no tickets exist to link.
+
+ After Phase 6, invoke the `lisa:prd-backlink` skill to write a `## Tickets` section back into the source Confluence PRD page. The section becomes the canonical anchor for the **Debrief** flow once the initiative ships.
+
+ Invoke `lisa:prd-backlink` with:
+
+ - `source_type: "confluence"`
+ - `source_ref`: the original Confluence page URL
+ - `tickets`: the full list created in Phases 3–5, each entry as `{ key, title, type, url, parent_key }`
+
+ If `lisa:prd-backlink` fails (page permission denied, Confluence unreachable), surface the error in the Phase 6 report rather than aborting — the tickets are already created. Recommend the user re-run `lisa:prd-backlink` standalone once the source is reachable.
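For concreteness, the invocation input might look like this. A sketch only: the JSON wrapper shape is an assumption about how your harness passes skill inputs, and every URL and ticket value below is invented.

```json
{
  "skill": "lisa:prd-backlink",
  "input": {
    "source_type": "confluence",
    "source_ref": "https://example.atlassian.net/wiki/spaces/ENG/pages/12345/Checkout-PRD",
    "tickets": [
      { "key": "SE-101", "title": "Checkout API endpoint", "type": "Story", "url": "https://example.atlassian.net/browse/SE-101", "parent_key": "SE-100" },
      { "key": "SE-102", "title": "Checkout UI", "type": "Story", "url": "https://example.atlassian.net/browse/SE-102", "parent_key": "SE-100" }
    ]
  }
}
```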
+

  ## Handling Ambiguities and Blockers

  When you encounter something the PRD + comments + codebase can't resolve:
@@ -0,0 +1,79 @@
+ ---
+ name: debrief
+ description: "Run the Debrief flow over a shipped initiative. Input: a PRD URL (Notion / Confluence / Linear / GitHub Issue / file), a JIRA epic key, or a GitHub epic issue URL. Output: a triage-ready learnings document covering every work item in the initiative — edge cases, gotchas, process friction, tooling gaps, convention drift — each with structured evidence and a human-disposition field. Persistence is deferred to lisa:debrief-apply."
+ allowed-tools: ["Skill", "ToolSearch", "TeamCreate", "Bash", "Read", "Glob", "Grep"]
+ ---
+
+ # Debrief: $ARGUMENTS
+
+ Walk the original Plan for `$ARGUMENTS`, mine the completed work items and their PRs, and produce a triage-ready learnings document for human review.
+
+ ## Orchestration: agent team
+
+ If you are NOT already operating inside an agent team (no prior `TeamCreate` in this session, not spawned via `Agent` with `team_name`), the very first thing you do is create the team. Two tool calls only, in this exact order:
+
+ 1. `ToolSearch` with `query: "select:TeamCreate"` — `TeamCreate` is a deferred tool whose schema must be loaded before it can be invoked. A cold call returns `InputValidationError` and tempts a fallback to direct `Agent` calls, which bypasses the team.
+ 2. `TeamCreate` — actually create the team.
+
+ Until `TeamCreate` returns successfully, do NOT call any of: `Agent`, `TaskCreate`, `Skill`, MCP tools (Atlassian / Linear / GitHub / Notion), `Read`, `Write`, `Edit`, `Bash`, `Grep`, `Glob`. Resolving the work-item set, fetching tickets, walking PRs — all of those are tasks for the team you are about to create, not for the lead session before the team exists.
+
+ If you ARE already inside an agent team (e.g., a teammate invoked this skill via the Skill tool), do NOT call `TeamCreate` — the harness rejects double-creates. Continue within the existing team.
+
+ ## Input
+
+ `$ARGUMENTS` is one of:
+
+ | Input shape | Resolution |
+ |-------------|------------|
+ | Notion / Confluence / Linear / GitHub Issue PRD URL | Fetch the PRD; read its `## Tickets` (or equivalent) back-link section written by the Plan flow |
+ | File path to a PRD markdown file | Read the file; parse its `## Tickets` section |
+ | JIRA epic key (e.g. `SE-1234`) or epic URL | Fetch the epic; list its child issues (Stories, Tasks, Bugs) |
+ | GitHub epic issue URL or `<org>/<repo>#<n>` | Fetch the epic issue; list its sub-issues / linked items |
+
+ If the PRD has no `## Tickets` section AND the input is not an epic, stop and report — the Plan flow's PRD back-link step (`lisa:prd-backlink`) was likely skipped. Suggest re-running Plan to populate the section, or pass the epic key directly.
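The resolution table above can be sketched as a small classifier. A minimal sketch: the patterns are illustrative, not the skill's actual parsing logic.

```python
import re

def classify_debrief_input(arg: str) -> str:
    """Best-guess resolution bucket for $ARGUMENTS; patterns are illustrative."""
    if re.fullmatch(r"[A-Z][A-Z0-9]*-\d+", arg):
        return "epic-key"         # e.g. SE-1234: fetch the epic, list child issues
    if re.fullmatch(r"[\w.-]+/[\w.-]+#\d+", arg):
        return "github-epic"      # <org>/<repo>#<n>: fetch the issue, list sub-issues
    if arg.startswith(("http://", "https://")):
        return "prd-or-epic-url"  # Notion / Confluence / Linear / GitHub / JIRA URL
    return "prd-file"             # local markdown: parse its "## Tickets" section
```

Note the order matters: a bare epic key is checked before the GitHub token so that neither falls through to the file-path default.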
+
+ ## Gate
+
+ Run before mining begins:
+
+ 1. **All work items terminal.** Every linked work item must be in a terminal state (Done / Closed / Cancelled equivalent for the tracker). If any item is still open, stop and list the unfinished items — Debrief is post-shipping by definition.
+ 2. **PR coverage.** Every Done item that was implementable (Story / Task / Bug; not Spike) must have at least one merged PR linked. Items missing a PR are recorded as **anomalies** to surface in the report rather than silently excluded — a Done item with no PR is itself a learning ("how did this ship?").
+ 3. **Headless safety.** In headless / `-p` / scheduled mode, do not block on missing input — fail fast with a clear error listing what was needed.
+
+ ## Flow
+
+ Execute the **Debrief** flow as defined in the `intent-routing` rule (loaded via the lisa plugin). The rule contains the canonical step sequence (gate, mining, synthesis, output, hand-off). This skill does NOT restate flow steps — change them in the rule, propagate everywhere.
+
+ The flow's mining step runs `tracker-mining-specialist` and `pr-mining-specialist` in parallel as separate tasks within the team. Both must complete before `learnings-synthesizer` runs. Express this with `blockedBy` so the synthesizer task is automatically gated on the two mining tasks.
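The gating can be sketched as three tasks. Task names are illustrative; use your harness's actual `TaskCreate` fields.

```text
TaskCreate  mine-tracker  ->  tracker-mining-specialist
TaskCreate  mine-prs      ->  pr-mining-specialist
TaskCreate  synthesize    ->  learnings-synthesizer   blockedBy: [mine-tracker, mine-prs]
```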
+
+ ## Exhaustiveness expectation
+
+ Debrief is deliberately exhaustive — the human, not the agent, decides what is worth keeping. Specialists should err toward surfacing more candidates, not fewer. A candidate that the synthesizer rates low confidence is still a row in the triage doc; only outright duplicates are dropped.
+
+ ## Output
+
+ A markdown triage document at `./debrief/<initiative-slug>-<YYYY-MM-DD>.md` (or wherever the project's debrief output directory is configured) containing:
+
+ 1. **Header** — initiative name, source PRD/epic link, work-item count, PR count, generation date, gate results.
+ 2. **Anomalies** — work items missing PRs, items with abnormal status-transition timing, PRs with no review comments at all (signal-of-absence is a learning), etc.
+ 3. **Candidate learnings** — one row per candidate, grouped by category (Edge case / Recurring gotcha / Process friction / Tooling gap / Convention drift). Each row has:
+    - `Summary` — one sentence
+    - `Category`
+    - `Evidence` — links to the source ticket comment / PR comment / commit / test file (multiple allowed)
+    - `Recommended persistence destination` — the agent's best guess for where this should land if accepted (e.g., "Edge Case Brainstorm checklist → Navigation & URL state", "PROJECT_RULES.md", "memory: project_*.md", "new tooling-gap ticket")
+    - `Disposition` — empty checkbox-style field the human will fill: `[ ] Accept` / `[ ] Reject` / `[ ] Defer` plus a free-text reason
+ 4. **Source map** — appendix listing every work item and PR walked, so the human can verify completeness.
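A hypothetical candidate-learning row, with the ticket key and evidence invented for illustration:

```text
- Summary: Back navigation after checkout re-submits the payment form
  Category: Edge case
  Evidence: SE-1042 comment thread; review comment on the checkout PR
  Recommended persistence destination: Edge Case Brainstorm checklist -> Navigation & URL state
  Disposition: [ ] Accept  [ ] Reject  [ ] Defer   Reason:
```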
+
+ The skill's terminal output is the path to the triage document and a one-line summary of counts per category. Persistence does not happen here — that is `lisa:debrief-apply`'s job.
+
+ ## Hand-off
+
+ After producing the triage document, print:
+
+ ```text
+ Triage document written to: <path>
+ Counts: <n> edge cases, <n> gotchas, <n> friction, <n> tooling gaps, <n> convention drift; <n> anomalies
+ Next: human triage. When done, run `/lisa:debrief:apply <path>` to persist accepted learnings.
+ ```
+
+ Then stop. Debrief never persists learnings on its own.
@@ -0,0 +1,63 @@
+ ---
+ name: debrief-apply
+ description: "Apply human-marked dispositions from a Debrief triage document. Reads the triage doc produced by lisa:debrief, parses each row's disposition (Accept / Reject / Defer), and routes Accepted items to their persistence destination. Deterministic and idempotent — safe to re-run if dispositions are added incrementally."
+ allowed-tools: ["Skill", "Bash", "Read", "Edit", "Write", "Glob", "Grep"]
+ ---
+
+ # Debrief Apply: $ARGUMENTS
+
+ Read the triage document at `$ARGUMENTS` and persist every Accepted candidate learning to its destination.
+
+ This skill is intentionally **single-agent** — there is no team. Routing is deterministic given the disposition column. Spawning sub-agents would only add latency.
+
+ ## Input
+
+ A path or URL to a Debrief triage document produced by `lisa:debrief`. The document is expected to follow the structure that skill produces — a header, an anomalies section, candidate-learning rows grouped by category, and a source-map appendix.
+
+ ## Pre-flight
+
+ 1. **Verify the doc exists and parses.** If the file cannot be read or the expected sections are missing, stop and report — do not guess.
+ 2. **Confirm dispositions exist.** If every row is unmarked, stop and ask the human to triage first. A pristine doc is a no-op, not an error to silently swallow.
+ 3. **Identify the destination map.** Read the project's `.lisa.config.json` (or stack defaults) for:
+    - the edge-case checklist file (default: the Edge Case Brainstorm sub-flow in `plugins/src/base/rules/intent-routing.md`)
+    - the project-rules file (default: `.claude/rules/PROJECT_RULES.md`)
+    - the memory directory (per the auto-memory system path)
+    - the tracker for new tickets
+
+ ## Routing rules
+
+ For every row marked **Accept**:
+
+ | Category | Destination | Action |
+ |----------|-------------|--------|
+ | Edge case | Edge Case Brainstorm checklist in `intent-routing.md` | Append the new pattern + question to the matching group (Navigation, Data, Failure, Input, Auth, or a new group if none fit). Use the row's `Summary` and `Evidence` link as a citation comment. |
+ | Recurring gotcha | Memory file (`project_*.md`) | Write a new memory entry with `type: project`, structured as: rule, **Why:**, **How to apply:**. Add an index line to `MEMORY.md`. |
+ | Process friction | Project rules file | Append a one-line guideline to `PROJECT_RULES.md` under an appropriate heading (or create one). |
+ | Tooling gap | Configured tracker | Create a new ticket via `lisa:tracker-write` with `issue_type: Task`, summary derived from the row's `Summary`, description citing the evidence and the originating debrief doc. Label appropriately (`type:tooling`, `lifecycle-improvement`, etc.). |
+ | Convention drift | `CLAUDE.md` (project) or `PROJECT_RULES.md` | Append the convention as a one-paragraph note under the relevant section. If no relevant section exists, create one. |
+
+ For every row marked **Reject** or **Defer**: no action. Defer is a no-op for `apply` but worth surfacing in the run summary — the human may want to revisit at the next debrief.
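Viewed as data, the routing table reduces to a small lookup. A sketch: the paths mirror the Pre-flight defaults and may be overridden by `.lisa.config.json`.

```python
# Illustrative mapping of the routing table; "tracker" means a new ticket
# via lisa:tracker-write rather than a file write.
ROUTING = {
    "Edge case":        "plugins/src/base/rules/intent-routing.md",
    "Recurring gotcha": "memory/project_*.md",   # plus an index line in MEMORY.md
    "Process friction": ".claude/rules/PROJECT_RULES.md",
    "Tooling gap":      "tracker",
    "Convention drift": "CLAUDE.md",             # or PROJECT_RULES.md
}

def route(category: str) -> str:
    """Accepted rows route by category; an unknown category should stop the run."""
    try:
        return ROUTING[category]
    except KeyError:
        raise ValueError(f"unroutable category: {category!r}")
```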
+
+ ## Idempotency
+
+ `apply` is safe to re-run. Each Accepted row carries an evidence link that doubles as a fingerprint — before writing, check whether the destination already cites that fingerprint. If it does, skip the write and note the row as `already-applied` in the run summary. This lets the human triage a doc incrementally (mark a few, run apply, mark more, run apply again) without producing duplicates.
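A minimal sketch of the fingerprint check for file destinations (`apply_once` is a hypothetical helper, not part of the skill; tracker destinations would need their own duplicate lookup):

```python
def apply_once(dest_path: str, entry: str, fingerprint: str) -> str:
    """Append entry to dest only if its evidence-link fingerprint is absent.

    Returns "applied" or "already-applied" for the run summary.
    """
    try:
        with open(dest_path, "r", encoding="utf-8") as f:
            existing = f.read()
    except FileNotFoundError:
        existing = ""
    if fingerprint in existing:
        return "already-applied"   # destination already cites this evidence link
    with open(dest_path, "a", encoding="utf-8") as f:
        # Record the fingerprint next to the entry so re-runs can detect it.
        f.write(entry.rstrip("\n") + f"\n<!-- evidence: {fingerprint} -->\n")
    return "applied"
```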
+
+ ## Updating the triage doc
+
+ After each Accepted row is persisted, replace its `[ ] Accept` checkbox with `[x] Applied — <one-line summary of what was written>`. This makes the triage doc itself the audit log of what was acted on. If a write fails (e.g., tracker is unreachable), mark the row `[!] Apply failed — <reason>` and continue with the rest. Never abort the whole run because one row failed.
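The checkbox rewrite is a plain string substitution (hypothetical helpers for illustration; the marker strings match the spec above):

```python
def mark_applied(row: str, summary: str) -> str:
    """Rewrite a triage row's disposition after a successful write."""
    return row.replace("[ ] Accept", f"[x] Applied — {summary}", 1)

def mark_failed(row: str, reason: str) -> str:
    """Rewrite a triage row's disposition after a failed write; the run continues."""
    return row.replace("[ ] Accept", f"[!] Apply failed — {reason}", 1)
```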
+
+ ## Output
+
+ A run summary printed to the user:
+
+ ```text
+ Applied <n> learnings:
+   <n> edge cases → intent-routing.md
+   <n> gotchas → memory
+   <n> friction → PROJECT_RULES.md
+   <n> tooling gaps → <tracker> (<key1>, <key2>, ...)
+   <n> convention drift → CLAUDE.md
+ Skipped:
+   <n> rejected, <n> deferred, <n> already-applied
+ Failed:
+   <n> (see <path> for details)
+ Triage doc updated in place: <path>
+ ```
+
+ If anything is written to a tracker, suggest the human commit the local file changes (memory, rules, intent-routing) when ready — `apply` does not commit.
@@ -252,6 +252,20 @@ After all tickets are created, present a summary table to the user:
  - Blockers list with recommendations and alternatives
  - Cross-PRD dependencies

+ ### Phase 7: PRD Back-link
+
+ > **Mode guard**: In `dry_run: true` mode, skip this phase entirely — no tickets exist to link.
+
+ After Phase 6, invoke the `lisa:prd-backlink` skill to write a `## Tickets` section back into the source GitHub Issue PRD body. The section becomes the canonical anchor for the **Debrief** flow once the initiative ships.
+
+ Invoke `lisa:prd-backlink` with:
+
+ - `source_type: "github"`
+ - `source_ref`: the original GitHub Issue URL or `<org>/<repo>#<n>` token
+ - `tickets`: the full list created in Phases 3–5, each entry as `{ key, title, type, url, parent_key }`
+
+ If `lisa:prd-backlink` fails (permission denied, GitHub unreachable, issue locked), surface the error in the Phase 6 report rather than aborting — the tickets are already created. Recommend the user re-run `lisa:prd-backlink` standalone once the source is reachable.
+

  ## Handling Ambiguities and Blockers

  When you encounter something the PRD + comments + codebase can't resolve:
@@ -252,6 +252,20 @@ After all tickets are created, present a summary table to the user:
  - Blockers list with recommendations and alternatives
  - Cross-PRD dependencies

+ ### Phase 7: PRD Back-link
+
+ > **Mode guard**: In `dry_run: true` mode, skip this phase entirely — no tickets exist to link.
+
+ After Phase 6, invoke the `lisa:prd-backlink` skill to write a `## Tickets` section back into the source Linear project (or its description). The section becomes the canonical anchor for the **Debrief** flow once the initiative ships.
+
+ Invoke `lisa:prd-backlink` with:
+
+ - `source_type: "linear"`
+ - `source_ref`: the original Linear project URL
+ - `tickets`: the full list created in Phases 3–5, each entry as `{ key, title, type, url, parent_key }`
+
+ If `lisa:prd-backlink` fails (permission denied, Linear unreachable), surface the error in the Phase 6 report rather than aborting — the tickets are already created. Recommend the user re-run `lisa:prd-backlink` standalone once the source is reachable.
+

  ## Handling Ambiguities and Blockers

  When you encounter something the PRD + comments + codebase can't resolve:
@@ -264,6 +264,20 @@ After all tickets are created, present a summary table to the user:
  - Blockers list with recommendations and alternatives
  - Cross-PRD dependencies

+ ### Phase 7: PRD Back-link
+
+ > **Mode guard**: In `dry_run: true` mode, skip this phase entirely — no tickets exist to link.
+
+ After Phase 6, invoke the `lisa:prd-backlink` skill to write a `## Tickets` section back into the source PRD. The section becomes the canonical anchor for the **Debrief** flow once the initiative ships, and gives any human reading the PRD months later a one-click path to every work item created from it.
+
+ Invoke `lisa:prd-backlink` with:
+
+ - `source_type: "notion"`
+ - `source_ref`: the original PRD URL
+ - `tickets`: the full list created in Phases 3–5, each entry as `{ key, title, type, url, parent_key }`
+
+ If `lisa:prd-backlink` fails (PRD permission denied, Notion unreachable, source mutated mid-run), surface the error in the Phase 6 report rather than aborting — the tickets are already created and their value to the team is not blocked by the back-link write. Recommend the user re-run `lisa:prd-backlink` standalone once the source is reachable again.
+

  ## Handling Ambiguities and Blockers

  When you encounter something the PRD + comments + codebase can't resolve: