specpipe 1.0.1 → 1.0.2

Files changed (46)
  1. package/README.md +111 -311
  2. package/package.json +2 -1
  3. package/src/cli.js +16 -6
  4. package/src/commands/diff.js +1 -1
  5. package/src/commands/init-agents.js +40 -20
  6. package/src/commands/init-global.js +88 -33
  7. package/src/commands/init-interactive.js +71 -0
  8. package/src/commands/init.js +61 -22
  9. package/src/commands/remove.js +159 -49
  10. package/src/commands/upgrade.js +21 -56
  11. package/src/lib/agent-guards.js +34 -78
  12. package/src/lib/agent-install.js +38 -25
  13. package/src/lib/agents.js +53 -11
  14. package/src/lib/claude-global.js +50 -77
  15. package/src/lib/hooks.js +203 -0
  16. package/src/lib/installer.js +73 -61
  17. package/src/lib/reconcile.js +13 -8
  18. package/templates/{.claude/hooks → hooks}/file-guard.js +26 -21
  19. package/templates/hooks/specpipe-read-guard.sh +94 -21
  20. package/templates/hooks/specpipe-shell-guard.sh +121 -29
  21. package/templates/rules/specpipe-rules.md +77 -0
  22. package/templates/skills/sp-build/SKILL.md +101 -1
  23. package/templates/skills/sp-build-behavior-matrix/SKILL.md +876 -0
  24. package/templates/skills/sp-challenge/SKILL.md +34 -0
  25. package/templates/skills/sp-challenge-behavior-matrix/SKILL.md +289 -0
  26. package/templates/skills/sp-explore/SKILL.md +132 -0
  27. package/templates/skills/sp-explore-behavior-matrix/SKILL.md +862 -0
  28. package/templates/skills/sp-fix/SKILL.md +73 -1
  29. package/templates/skills/sp-fix-behavior-matrix/SKILL.md +338 -0
  30. package/templates/skills/sp-investigate/SKILL.md +70 -0
  31. package/templates/skills/sp-investigate-behavior-matrix/SKILL.md +718 -0
  32. package/templates/skills/sp-plan/SKILL.md +90 -0
  33. package/templates/skills/sp-plan-behavior-matrix/SKILL.md +1037 -0
  34. package/templates/skills/sp-review/SKILL.md +29 -3
  35. package/templates/skills/sp-review-behavior-matrix/SKILL.md +294 -0
  36. package/templates/.claude/CLAUDE.md +0 -79
  37. package/templates/.claude/hooks/path-guard.sh +0 -118
  38. package/templates/.claude/hooks/self-review.sh +0 -27
  39. package/templates/.claude/hooks/sensitive-guard.sh +0 -227
  40. package/templates/.claude/settings.json +0 -68
  41. package/templates/docs/WORKFLOW.md +0 -325
  42. package/templates/docs/specs/.gitkeep +0 -0
  43. package/templates/rules/specpipe-guards.md +0 -40
  44. package/templates/scripts/test-hooks.sh +0 -66
  45. /package/templates/{.claude/hooks → hooks}/comment-guard.js +0 -0
  46. /package/templates/{.claude/hooks → hooks}/glob-guard.js +0 -0
@@ -0,0 +1,718 @@
1
+ ---
2
+ description: Read-only root-cause investigation — OPTIONAL branch before /sp-fix. Produces an investigation report with potential root cause hypotheses, evidence, blast radius — no code changes. Use when bug is complex, ambiguous, production-critical, or user explicitly wants to diagnose before fixing (outage, data corruption, regression, unclear stack trace, "it was working yesterday"). Skip for trivial bugs — go straight to /sp-fix. Writes docs/investigate/<slug>-YYYY-MM-DD.md and hands off to /sp-fix.
3
+ allowed-tools: Read, Write, Bash, Glob, Grep, AskUserQuestion, mcp__graphatlas__*
4
+ ---
5
+ Deep investigation — find root cause, map blast radius, report without changing code.
6
+
7
+ Target: $ARGUMENTS
8
+
9
+ ---
10
+
11
+ ## Scope
12
+
13
+ This skill **investigates only**. It does not write tests, fix code, or edit any file.
14
+
15
+ Output: a structured report with root cause hypothesis, evidence, blast radius,
16
+ and actionable next steps for whoever will fix it (human or `/sp-fix`).
17
+
18
+ ```
19
+ Allowed: Read, Grep, Glob, Bash (read-only: git log, git diff, git blame, find, cat, wc, etc.)
20
+ Write — ONLY to docs/investigate/<slug>-<date>.md (the handoff report)
21
+ Blocked: Edit (any existing file), Write outside docs/investigate/,
22
+ Bash (any command that modifies source/config/data, installs packages, or touches shared state)
23
+ ```
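
A minimal sketch of how the handoff report path could be derived, assuming a lowercase, hyphen-separated slug. The helper name `report_path` and the slug rules are illustrative assumptions; the skill itself only fixes the `docs/investigate/<slug>-<date>.md` shape.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the only path this skill is allowed to write to.
# $1 = short bug title, $2 = optional date (defaults to today, YYYY-MM-DD).
report_path() {
  local title=$1 day=${2:-$(date +%F)}
  local slug
  # Lowercase, collapse every run of non-alphanumerics into one hyphen,
  # then trim hyphens from both ends.
  slug=$(printf '%s' "$title" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -E 's/[^a-z0-9]+/-/g; s/^-+//; s/-+$//')
  printf 'docs/investigate/%s-%s.md\n' "$slug" "$day"
}
```

Passing the date explicitly keeps the result reproducible; omitting it falls back to the current day.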
24
+
25
+ ## Adaptive Depth
26
+
27
+ This skill auto-scales based on what it finds. No upfront mode selection needed.
28
+
29
+ ```
30
+ Context signal (from $ARGUMENTS):
31
+ - Mentions "production", "outage", "data loss", "corruption"
32
+ → bias toward deeper investigation, full blast radius
33
+ - Mentions "UI", "minor", "cosmetic", "styling"
34
+ → bias toward early exit once root cause is clear
35
+
36
+ Phase 2 (Locate) finds root cause with HIGH confidence?
37
+ → Skip Phase 3 (pattern match)
38
+ → Jump to Phase 4 (form hypothesis) → Phase 5 (blast radius) → report
39
+ → Investigation naturally short (~5 min)
40
+
41
+ Phase 2 unclear, Phase 3 pattern match helps?
42
+ → Standard depth
43
+ → Investigation ~10-15 min
44
+
45
+ Phase 3 also unclear, 3-strike rule hit?
46
+ → Report INSUFFICIENT_EVIDENCE with everything gathered
47
+ → Don't spin past 15 min total
48
+
49
+ Impact is clearly ISOLATED (1 function, ≤2 callers)?
50
+ → Phase 5 simplified: skip diagram, list direct impacts only
51
+
52
+ Impact is MODULE or wider?
53
+ → Phase 5 full: diagram + blast radius + similar risk scan
54
+ ```
55
+
56
+ **Soft timebox guidance:** If stuck > 5 min on any single phase → consider moving
57
+ to next phase with partial findings. Don't let one phase consume the entire budget.
58
+
59
+ ---
60
+
61
+ ## Iron Law
62
+
63
+ **Follow the evidence. Never start with a theory.**
64
+
65
+ Premature hypotheses cause tunnel vision. Gather facts first, then form a theory that explains ALL facts — not just the convenient ones.
66
+
67
+ ---
68
+
69
+ ## Phase 0a — Graphatlas probe (run once, silently)
70
+
71
+ Before Phase 1, probe whether graphatlas (GA) is connected:
72
+
73
+ 1. Call `mcp__graphatlas__ga_architecture` with `max_modules: 1`.
74
+ 2. Interpret:
75
+ - Returns `modules` → **GA available.** Use `ga_*` for every locate / blast-radius step below. Grep is fallback.
76
+ - Error `STALE_INDEX` → call `mcp__graphatlas__ga_reindex` (mode `"full"`), retry once, then treat as available. (This skill is read-only, so no further reindex is needed during the run.)
77
+ - Tool not found / connection error / any other failure → **GA unavailable.** Use grep/glob throughout. Do not re-probe.
78
+ 3. Carry the outcome through Phases 1-5.
79
+
80
+ ---
81
+
82
+ ## Phase 1: Understand the Report
83
+
84
+ Parse what you're given. Clarify what you're not.
85
+
86
+ **Extract these from `$ARGUMENTS`:**
87
+
88
+ | Field | Required | If Missing |
89
+ |-------|----------|------------|
90
+ | Symptom | Yes | Cannot proceed — ask |
91
+ | Expected behavior | Yes | Cannot proceed — ask |
92
+ | Actual behavior | Yes | Cannot proceed — ask |
93
+ | Repro steps | Helpful | Attempt to infer from code; flag as assumption |
94
+ | Environment | Helpful | Assume production-like; flag as assumption |
95
+ | Frequency | Helpful | Assume consistent; flag if intermittent evidence found |
96
+
97
+ If 2+ required fields are missing → ask ONE question via `AskUserQuestion`:
98
+
99
+ ```json
100
+ {
101
+ "questions": [{
102
+ "question": "I need more context to investigate. What's happening?",
103
+ "header": "Bug context",
104
+ "multiSelect": false,
105
+ "options": [
106
+ {"label": "Describe behavior", "description": "What did you expect vs what actually happened"},
107
+ {"label": "Paste error", "description": "Error message, stack trace, or screenshot description"},
108
+ {"label": "Point to code", "description": "Specific file, function, or feature area to investigate"}
109
+ ]
110
+ }]
111
+ }
112
+ ```
113
+
114
+ **Do NOT proceed past Phase 1 without clear symptom + expected + actual.**
115
+
116
+ ### 1.5 — Behavior Matrix context
117
+
118
+ If the report mentions status/state, role/viewer, list/detail/worklist/dashboard/feed/API/email/calendar, notification, external provider, or cross-module inconsistency, look for a related spec with `## Behavior Matrix`. Use the invariant registry README/schema as base knowledge; README examples are not runtime entries. Then read project-local invariant entries if present:
119
+
120
+ - `docs/specs/<feature>/<feature>.md`
121
+ - `docs/invariants/INV-*.md`
122
+
123
+ Record the current mapping hypothesis. This is allowed to be partial until Phase 4:
124
+
125
+ ```
126
+ BM CONTEXT
127
+ ═══════════════════════════════
128
+ State/status: <state or transition | unknown>
129
+ Viewer/role: <actor/viewer/relationship | unknown>
130
+ Surface/path: <list/detail/API/feed/calendar/etc. | unknown>
131
+ Matrix cell: BM.AS-NNN.<surface> | GAP-NNN | N/A:<reason> | NO_CELL | unknown
132
+ Invariant match: <INV/C id + status | invariant text | none | no registry found>
133
+ ```
134
+
135
+ If no spec or invariant registry exists, continue. Do not invent one during investigation; report the absence as a gap if it matters. If the bug confirms a repeated lifecycle/parity/cascade rule, end the report with `Invariant action needed: add/update invariant: ...`.
136
+
137
+ ### 1.6 — Sibling Discovery Pass (candidate only)
138
+
139
+ Run this for lifecycle/parity/cascade bugs, existing-operation investigations, or any report whose symptom names one surface but the operation may exist on sibling entry-points.
140
+
141
+ Purpose: diagnose blast radius before deciding root cause. This produces candidates, not requirements or fixes.
142
+
143
+ 1. Seed nouns/verbs from the raw symptom, touched component, related spec/BM context, and matching invariant entries.
144
+ 2. Find shared-anchor callers (`ga_callers` if GA is available; otherwise grep) for helpers/constants/schemas that define the operation.
145
+ 3. Fuzzy-search parallel names such as `create_from_*`, `*_from_<source>`, `send_*invite*`, `*_outcome*`, `reschedule*`, `book_next*`, `cancel*`, `delete*`, plus domain verbs from the symptom.
146
+ 4. Inspect recent git co-change around touched files (`git log --name-only --full-diff -- <seed-file>`; without `--full-diff`, the path filter hides every file except the seed) for repeatedly paired files/functions.
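
The co-change step can be sketched as a small ranking filter. This is an illustration only: `rank_cochanged` is a hypothetical helper, and it assumes its input is a plain list of changed paths, one per line, such as `git log` produces with `--name-only` and an empty pretty format.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rank files that most often change in the same commits
# as a seed file. Reads changed paths (one per line) on stdin.
# $1 = seed file path, which is excluded from the ranking.
rank_cochanged() {
  local seed=$1
  grep -v -x -F -e "$seed" \
    | grep -v '^$' \
    | sort \
    | uniq -c \
    | sort -rn
}
```

A possible invocation is `git log --pretty=format: --name-only --full-diff -- src/a.js | rank_cochanged src/a.js`. The `--full-diff` flag matters here: with a path filter alone, `git log --name-only` lists only the filtered path, so no sibling files would appear. High counts are candidates for the Sibling Candidate Table, not proof of a relationship.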
147
+
148
+ Record a `Sibling Candidate Table` in the investigation output:
149
+
150
+ | Candidate | Operation | Evidence | Confidence | Investigation disposition |
151
+ |---|---|---|---|---|
152
+ | `<surface/path/symbol>` | same create/update/delete/send/read op? | ga_callers / grep / co-change / invariant / symptom | high / medium / low | likely-related / needs-spec-GAP / ignore(reason) |
153
+
154
+ Do not auto-fix candidates. `likely-related` means the candidate belongs in the root-cause/blast-radius analysis. `needs-spec-GAP` means the report exposes an underspecified sibling. `ignore(reason)` must name why the candidate is a false positive or out of scope.
155
+
156
+ ---
157
+
158
+ ## Phase 2: Locate
159
+
160
+ Find where the bug lives. Work from the outside in.
161
+
162
+ ### 2.1 — Entry Point Search
163
+
164
+ Start with the most specific artifact available, in priority order:
165
+
166
+ | Have This? | Search Strategy |
167
+ |------------|----------------|
168
+ | Error message / stack trace | Grep exact error string → follow call stack |
169
+ | Function or class name | Grep definition → read implementation |
170
+ | Feature/screen name | Grep for route/handler/view name → trace to logic |
171
+ | Only vague description | Grep keywords → read surrounding code → narrow |
172
+
173
+ > **If GA available (per Phase 0a):** `ga_symbols("<function or type>")` for definitions (ranked by caller count — picks the popular def when names collide), then `ga_callers` / `ga_callees` to map the call graph; `ga_impact(symbol=...)` for a whole-feature view. **If GA is unavailable, or the query is a free-text error string that lives inside a string literal:** use the grep recipes below.
174
+
175
+ ```bash
176
+ # Extension set covers ~90% of mainstream code:
177
+ # JS/TS family, Python, Ruby, Go, Rust, Java/Kotlin/Scala/Groovy, Swift/ObjC,
178
+ # C/C++/C#, PHP, Dart, Elixir, Erlang, Haskell, Clojure, Elm, R, Julia,
179
+ # Zig, Nim, PowerShell, shell, SQL, web templates
180
+ INCLUDES=(--include='*.'{js,jsx,ts,tsx,mjs,cjs,vue,svelte,py,rb,go,rs,java,kt,kts,scala,groovy,swift,m,mm,c,cc,cpp,cxx,h,hh,hpp,cs,php,dart,lua,ex,exs,erl,hs,clj,cljs,elm,r,jl,zig,nim,ps1,sh,bash,zsh,sql,erb,html})  # bash/zsh array: one --include per extension, because grep does not expand braces itself
181
+
182
+ # Error message → find origin
183
+ grep -rn "exact error text" "${INCLUDES[@]}" .
184
+
185
+ # Function → find definition + callers
186
+ grep -rn "function_name" "${INCLUDES[@]}" .
187
+
188
+ # Feature → find entry point
189
+ grep -rn "route\|handler\|endpoint\|view.*FeatureName" .
190
+ ```
191
+
192
+ ### 2.2 — Check for Recurring Bugs
193
+
194
+ Before diving deep, check if this area has a history of bugs:
195
+
196
+ ```bash
197
+ # How often has this file been fixed?
198
+ git log --oneline --all -- <affected-file> | grep -i "fix\|bug\|patch\|hotfix" | head -10
199
+
200
+ # How many authors have touched this file recently?
201
+ git shortlog -sn --since="6 months ago" -- <affected-file>
202
+ ```
203
+
204
+ **Recurring bug signal:** If the same file/module shows 3+ bug-fix commits targeting the **same function or same bug pattern** in recent history → this is likely an **architectural smell**, not a one-off bug. Flag this:
205
+
206
+ ```
207
+ ⚠️ RECURRING BUG AREA: <file:function> has N fix commits in last M months
208
+ Pattern: <what keeps breaking — same null check? same race? same state issue?>
209
+ Implication: root cause may be structural (wrong abstraction, missing invariant,
210
+ unclear ownership) rather than a simple code error
211
+ ```
212
+
213
+ Note: 3 fixes in the same FILE but targeting completely different functions/concerns
214
+ is normal churn, not a smell. The signal is repeated fixes for the SAME pattern.
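
As a rough first filter for the recurring-bug signal, the fix-commit count can be scripted. This is a sketch under stated assumptions: `count_fix_commits` and `recurring_signal` are hypothetical names, the keyword list is illustrative, and the input is assumed to be `git log --oneline` output on stdin. Whole-word matching is used so that subjects like "dispatch" or "prefix" are not counted.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count commit subjects that look like bug fixes.
# Reads `git log --oneline` output on stdin and prints the count.
count_fix_commits() {
  # -w matches whole words only; `|| true` keeps a zero count from
  # being treated as a failure (grep exits 1 when nothing matches).
  grep -ciwE 'fix(es|ed)?|bugs?|bugfix|patch(es|ed)?|hotfix(es)?' || true
}

# Apply the 3+ threshold from the text above.
recurring_signal() {
  local n
  n=$(count_fix_commits)
  if [ "$n" -ge 3 ]; then
    echo "RECURRING BUG AREA: $n fix commits"
  else
    echo "normal churn: $n fix commits"
  fi
}
```

For example: `git log --oneline --all -- <affected-file> | recurring_signal`. The count only says the file attracts fixes; deciding whether those fixes target the same function or pattern still requires reading the commits.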
215
+
216
+ ### 2.3 — Trace the Data Flow
217
+
218
+ Starting from the entry point, trace forward through the code:
219
+
220
+ ```
221
+ INPUT → Where does the data enter?
222
+ → TRANSFORM → What functions process it?
223
+ → DECISION → What branches/conditions control flow?
224
+ → OUTPUT → Where does the result surface to the user?
225
+ → SIDE EFFECTS → What else happens? (DB write, cache update, event emit)
226
+ ```
227
+
228
+ At each step, note:
229
+ - What type is the data? Can it be null/nil/None/undefined here?
230
+ - What assumptions does this code make about its input?
231
+ - Are there error paths? Do they swallow errors silently?
232
+
233
+ **Tentative hypotheses are fine.** You will naturally form theories while tracing.
234
+ That's good — note them. But don't commit to a hypothesis until the full causal chain
235
+ (location → mechanism → symptom) is verified with code evidence. The Iron Law says
236
+ "follow evidence first" — not "suppress all intuition."
237
+
238
+ ### 2.4 — Check History
239
+
240
+ ```bash
241
+ # Recent changes to affected files
242
+ git log --oneline -20 -- <affected-files>
243
+
244
+ # What changed in the last commit that touched this file?
245
+ git log -1 -p -- <affected-file>
246
+
247
+ # When was this line last changed? By whom?
248
+ git blame -L <start>,<end> -- <affected-file>
249
+
250
+ # Was this file recently refactored?
251
+ git log --oneline --diff-filter=M -10 -- <affected-file>
252
+ ```
253
+
254
+ **Regression signal:** If the behavior worked before and a recent commit changed the affected code → the bug is likely in that diff. Flag this:
255
+ ```
256
+ ⚠️ REGRESSION SIGNAL: <commit-hash> (<date>) — <commit message>
257
+ Changed: <file:lines>
258
+ Before: <old behavior>
259
+ After: <new behavior>
260
+ ```
261
+
262
+ ---
263
+
264
+ ## Phase 3: Pattern Match (when needed)
265
+
266
+ **Skip this phase if:** Phase 2 already produced a HIGH confidence hypothesis
267
+ with complete causal chain (location + mechanism + evidence). Jump to Phase 4.
268
+
269
+ **Use this phase when:**
270
+ - Symptom is unclear or ambiguous
271
+ - Data flow trace didn't reveal obvious cause
272
+ - Investigation is stuck — need a framework to think through
273
+ - Bug is non-obvious or intermittent
274
+
275
+ Match the observed symptom against known bug patterns.
276
+ Don't mechanically check every row — scan for patterns that FIT the evidence you have.
277
+
278
+ | # | Pattern | Signature | Investigation Steps |
279
+ |---|---------|-----------|-------------------|
280
+ | 1 | **Nil/null propagation** | TypeError, NullPointerException, "undefined is not a function", unwrap on None | Trace value backwards from crash site → find where it becomes nil. Check: is there a guard? Is the guard in the wrong place? |
281
+ | 2 | **Race condition** | Intermittent, timing-dependent, "works locally", flaky test | Find shared mutable state. Check: multiple concurrent accessors? Missing lock/mutex/actor isolation? |
282
+ | 3 | **State corruption** | Inconsistent data, partial update visible, "impossible" state | Find state mutation points. Check: transaction boundary? Cleanup after error? Multiple writers? |
283
+ | 4 | **Off-by-one / boundary** | Wrong count, missing last item, extra item, index out of bounds | Find loop/slice/range. Check: `<` vs `<=`? 0-indexed vs 1-indexed? Empty collection handled? |
284
+ | 5 | **Type coercion / cast** | Wrong value type, unexpected string "null", NaN, "0" vs 0 | Find type boundaries (JSON parse, DB query, API response). Check: implicit conversion? Missing validation? |
285
+ | 6 | **Stale data** | Shows old data, fixes on refresh/restart, cache-related | Find cache layers (memory, Redis, CDN, browser). Check: invalidation after write? TTL too long? |
286
+ | 7 | **Configuration drift** | Works locally, fails in staging/prod | Compare env vars, feature flags, DB schema, API versions across environments |
287
+ | 8 | **Silent error swallow** | No error shown but wrong behavior | Grep for empty catch blocks, `_ =`, `catch {}`, `.catch(() => {})`. Check: error logged but not propagated? |
288
+ | 9 | **Ordering / timing** | Depends on execution order, async operations complete out of order | Find async operations. Check: await missing? Race between promises/tasks? Event ordering assumed? |
289
+ | 10 | **Resource leak** | Gradually degrades, OOM, connection pool exhausted, file descriptor limit | Find open/acquire without close/release. Check: error path also closes? Loop creates without releasing? |
290
+ | 11 | **Incorrect merge / conflict resolution** | Bug appears after merge, code has conflicting logic | `git log --merges -5 -- <file>`. Check: merge conflict resolved incorrectly? Both sides kept when one should win? |
291
+ | 12 | **API contract mismatch** | Caller sends X, receiver expects Y | Find both sides of the boundary. Check: field names match? Types match? Optional vs required? |
292
+ | 13 | **Lifecycle / viewer / surface parity** | Correct on one status/role/surface but wrong on another | Map state x viewer x surface; compare write model, read models, queues, dashboard counts, feed, APIs, notifications, calendar |
293
+ | 14 | **Cascade propagation gap** | Write succeeds but derived surfaces are stale/missing | Trace write side effects into projections, cache invalidation, event handlers, queues, external integrations |
294
+ | 15 | **External-down divergence** | Internal state updates but provider/email/calendar state is wrong or invisible | Trace retry queue, provider status, user-visible retry surface, idempotency key |
295
+
296
+ For each matching pattern, record:
297
+ ```
298
+ PATTERN MATCH: #N <name>
299
+ Evidence: <specific code/log that matches this pattern>
300
+ Confidence: HIGH / MEDIUM / LOW
301
+ ```
302
+
303
+ ### External Search (when no pattern matches)
304
+
305
+ If the bug doesn't match any known pattern above, and the error message or behavior
306
+ is unfamiliar, search externally:
307
+
308
+ ```
309
+ Search: "{framework} {sanitized error type}"
310
+ Search: "{library} {component} known issues"
311
+ ```
312
+
313
+ **⚠️ SANITIZE BEFORE SEARCHING:**
314
+ Strip from the error message before using as search query:
315
+ - Hostnames, IPs, internal URLs
316
+ - File paths containing usernames or project names
317
+ - SQL fragments, query parameters
318
+ - Customer data, user IDs, email addresses
319
+ - API keys, tokens, secrets (obviously)
320
+
321
+ Search the **generic error type and framework context**, not the raw message.
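
The sanitizing step could be partly automated before a query leaves the machine. The following is a minimal sketch, not a complete scrubber: `sanitize_error` is a hypothetical helper, the regular expressions are illustrative, and they cover only URLs, email addresses, IPv4 addresses, absolute paths, and long token-like strings. Hostnames, SQL fragments, and customer data from the list above still need a manual pass.

```shell
#!/usr/bin/env bash
# Hypothetical helper: replace common sensitive fragments in an error
# message with placeholders. Reads text on stdin, writes to stdout.
sanitize_error() {
  sed -E \
    -e 's#https?://[^[:space:]]+#<url>#g' \
    -e 's/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/<email>/g' \
    -e 's/([0-9]{1,3}\.){3}[0-9]{1,3}/<ip>/g' \
    -e 's#(/[A-Za-z0-9._-]+){2,}#<path>#g' \
    -e 's/[A-Za-z0-9_-]{32,}/<token>/g'
}
```

URLs are replaced first so that their path segments are not matched again by the path rule. The output is a starting point for a query; the generic error type and framework name should still be chosen by hand.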
322
+
323
+ If search reveals a documented bug or known issue → record as a candidate hypothesis
324
+ in Phase 4 with source link.
325
+
326
+ ---
327
+
328
+ ## Phase 4: Form Hypothesis
329
+
330
+ Based on evidence from Phases 2-3, form a **specific, testable** hypothesis.
331
+
332
+ ### Requirements for a Valid Hypothesis
333
+
334
+ ```
335
+ A valid hypothesis MUST:
336
+ ✓ Name a specific location (file:line or function)
337
+ ✓ Describe WHAT is wrong (the mechanism)
338
+ ✓ Explain WHY it produces the observed symptom
339
+ ✓ Be falsifiable (describe what evidence would DISPROVE it)
340
+
341
+ A hypothesis MUST NOT:
342
+ ✗ Be vague ("something is wrong with the cache")
343
+ ✗ Name a symptom as a cause ("it crashes because of a null pointer")
344
+ → WHY is the pointer null?
345
+ ✗ Require assumptions not grounded in code evidence
346
+ ```
347
+
348
+ ### Format
349
+
350
+ ```
351
+ HYPOTHESIS
352
+ ══════════
353
+ Location: <file:line or file:function>
354
+ Mechanism: <what is going wrong, mechanically>
355
+ Chain: <input> → <step 1> → <step 2> → ... → <symptom>
356
+ Disproof: <what evidence would prove this wrong>
357
+ Confidence: HIGH / MEDIUM / LOW
358
+ Basis: <list evidence that supports this>
359
+ Behavior Matrix:
360
+ State/status: <state or transition>
361
+ Viewer/role: <viewer/relationship>
362
+ Surface/path: <surface>
363
+ Cell: BM.AS-NNN.<surface> | GAP-NNN | N/A:<reason> | NO_CELL
364
+ Spec gap: none | gap-open | suspicious-N/A | missing-cell
365
+ Invariant: <matched invariant | new invariant candidate | none>
366
+ ```
367
+
368
+ ### Confidence Levels
369
+
370
+ | Level | Definition | Threshold |
371
+ |-------|-----------|-----------|
372
+ | **HIGH** | Traced complete chain from cause to symptom in code. Regression commit identified. Or: reproduced deterministically. | Can explain every step with code references |
373
+ | **MEDIUM** | Strong circumstantial evidence. Chain mostly traced but 1-2 gaps remain. Pattern match is strong. | Most steps have code references, some inferred |
374
+ | **LOW** | Plausible theory consistent with symptoms but significant gaps in evidence. Multiple alternative explanations possible. | Theory fits but lacks direct code proof |
375
+
376
+ If confidence is LOW → do NOT present as finding. Continue investigating or report INSUFFICIENT_EVIDENCE.
377
+
378
+ ### Hypothesis Verification Suggestions
379
+
380
+ For each hypothesis, describe HOW it can be verified without changing code:
381
+
382
+ ```
383
+ VERIFICATION PLAN
384
+ ═════════════════
385
+ To confirm this hypothesis:
386
+ 1. <read-only step — e.g., "check value of X at runtime via existing logs">
387
+ 2. <read-only step — e.g., "grep for other callers of this function to see if they hit same path">
388
+ 3. <read-only step — e.g., "compare git blame output with the date the bug was first reported">
389
+
390
+ If read-only verification is insufficient:
391
+ Instrumentation suggestion: <e.g., "add temporary log at file:line to capture value of X">
392
+ ⚠️ This requires code change — note for whoever implements the fix.
393
+ ```
394
+
395
+ ### 3-Strike Rule
396
+
397
+ If 3 hypotheses are formed and NONE can be supported to MEDIUM+ confidence → **STOP**.
398
+
399
+ Use `AskUserQuestion`:
400
+
401
+ ```json
402
+ {
403
+ "questions": [{
404
+ "question": "3 hypotheses investigated, none confirmed to medium+ confidence. How to proceed?",
405
+ "header": "Stalled",
406
+ "multiSelect": false,
407
+ "options": [
408
+ {"label": "New evidence", "description": "I have additional context that might help (describe it)"},
409
+ {"label": "Instrument", "description": "Add logging to the affected area, catch it next time"},
410
+ {"label": "Report as-is", "description": "Publish findings so far with INSUFFICIENT_EVIDENCE status"}
411
+ ]
412
+ }]
413
+ }
414
+ ```
415
+
416
+ Do NOT keep spinning. 3 strikes = escalate or report partial findings.
417
+
418
+ ### Multiple Hypotheses
419
+
420
+ If evidence supports 2+ plausible root causes:
421
+
422
+ ```
423
+ HYPOTHESIS A (PRIMARY — HIGH confidence)
424
+ Location: ...
425
+ Mechanism: ...
426
+
427
+ HYPOTHESIS B (ALTERNATIVE — MEDIUM confidence)
428
+ Location: ...
429
+ Mechanism: ...
430
+ Why less likely: <specific reason A is preferred over B>
431
+ ```
432
+
433
+ Rank by confidence. Maximum 3 hypotheses — if you have more, you haven't narrowed enough.
434
+
435
+ ---
436
+
437
+ ## Phase 5: Map Blast Radius
438
+
439
+ Determine what else is affected. This informs fix priority and scope.
440
+
441
+ ### 5.0 — Declare Investigation Scope
442
+
443
+ Before mapping blast radius, declare the narrowest scope containing the bug:
444
+
445
+ ```
446
+ INVESTIGATION SCOPE
447
+ ═══════════════════════════════
448
+ Primary: <directory or module containing root cause>
449
+ Secondary: <directories containing direct callers/dependents>
450
+ Out of scope: <what was NOT investigated, and why>
451
+ ```
452
+
453
+ This helps whoever fixes the bug understand what was examined and what wasn't.
454
+
455
+ ### 5.1 — Bug Path Diagram (skip if ISOLATED)
456
+
457
+ **If impact scope is clearly ISOLATED** (bug in 1 function, ≤2 direct callers,
458
+ no shared state, no persistence side effects):
459
+ → Skip diagram. List direct impacts in 2-3 bullet points.
460
+
461
+ **If impact scope is MODULE or wider:**
462
+ → Draw full diagram:
463
+
464
+ ```
465
+ BUG PATH DIAGRAM
466
+ ═══════════════════════════
467
+ [+] <file>
468
+
469
+ └── affectedFunction()
470
+ ├── [★★ TESTED] Normal path — test_file:12
471
+ ├── [BUG] <edge case> (← root cause here)
472
+ │ ├── [GAP] <downstream effect 1> — NO TEST
473
+ │ └── [GAP] <downstream effect 2> — NO TEST
474
+ ├── [★★ TESTED] Other branch — test_file:20
475
+ └── [→MANUAL] View/UI rendering — visual verification only
476
+
477
+ Legend:
478
+ [★★ TESTED] = has test coverage
479
+ [BUG] = root cause location
480
+ [GAP] = no test, affected by bug
481
+ [→MANUAL] = UI/visual, cannot automate
482
+ [UNCLEAR] = couldn't determine coverage, needs human check
483
+ ```
484
+
485
+ > **If GA available, lean on it for blast radius.** `ga_impact(symbol=...)` is the one-shot tool — returns impacted files, tests, routes, and a runtime risk score. Pair with `ga_callers` / `ga_callees` for the call graph and `ga_architecture` to identify the module/layer (auth, payment, core). More accurate than grep — uses typed CALL/REFERENCES edges and resolves polymorphic dispatch. If GA is unavailable, fall back to grep + manual file reading.
486
+
487
+ ### 5.2 — Impact Scope
488
+
489
+ ```
490
+ BLAST RADIUS
491
+ ═══════════════════════════
492
+ Direct impact:
493
+ - <file:function> — <what goes wrong>
494
+ - <file:function> — <what goes wrong>
495
+
496
+ Indirect impact (callers of affected code):
497
+ - <file:function> calls <affected> → may see <effect>
498
+ - <file:function> calls <affected> → may see <effect>
499
+
500
+ Data impact:
501
+ - <table/collection> — could have <inconsistent state>
502
+ - <cache key> — could serve <stale data>
503
+
504
+ User-facing impact:
505
+ - <feature/screen> — user sees <wrong behavior>
506
+ - <API endpoint> — returns <wrong response>
507
+
508
+ Behavior Matrix impact:
509
+ - <BM.AS-NNN.surface> — <broken state/viewer/surface behavior>
510
+ - <GAP-NNN or NO_CELL> — <spec hole exposed by bug>
511
+
512
+ Impact scope: ISOLATED | MODULE | CROSS-MODULE | SYSTEM-WIDE
513
+ ```
514
+
515
+ ### 5.3 — Similar Risk Scan (skip if ISOLATED + unique pattern)
516
+
517
+ **Skip if:** bug is ISOLATED AND the code pattern is unique to this location
518
+ (not a repeated idiom). Note: "scan skipped — pattern unique to this location."
519
+
520
+ **Run if:** the bug pattern could plausibly exist elsewhere (e.g., missing null check
521
+ on API response, unguarded concurrent access, cache not invalidated after write).
522
+
523
+ Grep for the same pattern elsewhere. Timebox: 5 minutes max.
524
+
525
+ ```bash
526
+ # Example: if bug is a missing null check on API response
527
+ grep -rn "\.data\." --include="*.ts" . | grep -v "?\.data\.\|\.data &&\|\.data !=\|\.data !=="
528
+ # → finds other places accessing .data without null check
529
+ ```
530
+
531
+ **Beyond grep — think at design level:**
532
+ - Same abstraction used elsewhere? (e.g., other repositories using same base class)
533
+ - Same API contract reused? (e.g., other endpoints making same assumption about response shape)
534
+ - Same concurrency pattern repeated? (e.g., other handlers doing read-modify-write without lock)
535
+ - Same cache pattern? (e.g., other services writing without invalidation)
536
+
537
+ Grep catches syntax-level repetition. Design-level thinking catches same-class-of-bug
538
+ in different code that looks nothing alike syntactically.
539
+
540
+ Record findings:
541
+ ```
542
+ SIMILAR RISK
543
+ ═══════════════════════════
544
+ Same pattern found at:
545
+ - <file:line> — <description>
546
+ - <file:line> — <description>
547
+ - (none found — pattern is unique to this location)
548
+
549
+ Scan scope: <what was searched, what pattern>
550
+ Timebox: 5 minutes (do not let this block the report)
551
+ ```
552
+
553
+ ---
554
+
555
+ ## Phase 6: Recommend Next Steps
556
+
557
+ Based on the investigation, recommend specific actions.
558
+
559
+ ```
560
+ RECOMMENDED ACTIONS
561
+ ═══════════════════════════
562
+
563
+ 1. [CRITICAL] <action — specific file, specific change>
564
+ Reason: <why this is needed>
565
+ Estimated scope: <N files, complexity LOW/MEDIUM/HIGH>
566
+
567
+ 2. [HIGH] <action>
568
+ Reason: ...
569
+
570
+ 3. [MEDIUM] <action>
571
+ Reason: ...
572
+
573
+ Test strategy:
574
+ - Regression test: <what to test, at what level (unit/integration)>
575
+ - Behavior Matrix regression: <BM.AS-NNN.surface test name, or "spec gap before test">
576
+ - Existing tests to verify: <list test names that should still pass>
577
+ - Manual verification: <what to check visually, if applicable>
578
+
579
+ Suggested fix approach:
580
+ □ Minimal fix (patch the specific bug) — use when blast radius is ISOLATED
581
+ □ Targeted refactor (fix pattern across affected module) — use when SIMILAR RISK has 3+ hits
582
+ □ Architectural fix (redesign the interaction) — use when root cause is structural
583
+
584
+ → To fix: run `/sp-fix <paste root cause summary>`
585
+ ```
586
+
587
+ ---
588
+
589
+ ## Output: Investigation Report
590
+
591
+ **Omit empty sections.** If a section has no meaningful content for this investigation,
592
+ leave it out entirely. A 5-section report for a simple bug is better than a 12-section
593
+ report with 7 empty sections.
594
+
595
+ ```
596
+ INVESTIGATION REPORT
597
+ ════════════════════════════════════════════════════════════════
598
+
599
+ Target: <what was investigated>
600
+ Date: <date>
601
+ Status: ROOT_CAUSE_FOUND | PROBABLE_CAUSE | INSUFFICIENT_EVIDENCE | BLOCKED
602
+
603
+ ─── SUMMARY ───
604
+ <2-3 sentences: what's wrong, why, and what to do about it>
605
+
606
+ ─── SYMPTOM ───
607
+ Expected: <what should happen>
608
+ Actual: <what happens instead>
609
+ Frequency: <always / intermittent / under specific conditions>
610
+
611
+ ─── ROOT CAUSE ───
612
+ HYPOTHESIS A (PRIMARY — <confidence>)
613
+ Location: <file:line>
614
+ Mechanism: <what is wrong>
615
+ Chain: <cause> → <step> → ... → <symptom>
616
+ Behavior Matrix:
617
+ State/status: <state or transition>
618
+ Viewer/role: <viewer/relationship>
619
+ Surface/path: <surface>
620
+ Cell: BM.AS-NNN.<surface> | GAP-NNN | N/A:<reason> | NO_CELL
621
+ Spec gap: none | gap-open | suspicious-N/A | missing-cell
622
+ Invariant: <matched invariant | new invariant candidate | none>
623
+ Evidence:
624
+ - <file:line> — <what this code shows>
625
+ - <git commit> — <what this change reveals>
626
+ - <log/output> — <what this data proves>
627
+ Disproof: <what would prove this wrong>
628
+
629
+ (HYPOTHESIS B if applicable)
630
+
631
+ ─── POTENTIAL GAPS ───
632
+ Risks discovered during investigation that may not be the root cause of THIS bug,
633
+ but represent risks for future bugs:
634
+
635
+ - <file:line> — Missing guard/validation: <what's unprotected>
636
+ - <file:line> — Assumption not enforced: <what assumption could break>
637
+ - <file:function> — No test coverage for: <path/branch>
638
+ - <file:function> — Fragile pattern: <why this could easily break again>
639
+ - (none discovered — investigation scope was clean)
640
+
641
+ These are inputs for refactor/tech-debt decisions, not immediate fixes.
642
+
643
+ ─── REGRESSION? ───
644
+ <Yes — commit <hash> introduced this on <date> | No — pre-existing | Unknown>
645
+
646
+ ─── RECURRING? ───
647
+ <Yes — N fix commits in this area in last M months. Architectural smell suspected.
648
+ Pattern: <what keeps breaking> | No — first known bug in this area>
649
+
650
+ ─── BUG PATH ───
651
+ (omit if ISOLATED)
652
+ <Bug Path Diagram from Phase 5.1>
653
+
654
+ ─── BLAST RADIUS ───
655
+ Scope: <ISOLATED | MODULE | CROSS-MODULE | SYSTEM-WIDE>
656
+ <Impact details from Phase 5.2>
657
+
658
+ ─── BEHAVIOR MATRIX IMPACT ───
659
+ Cells:
660
+ - <BM.AS-NNN.surface> — <affected behavior>
661
+ - <GAP-NNN / NO_CELL> — <spec hole if found>
662
+ State/viewer/surface class: <lifecycle | viewer-parity | surface-parity | cascade | external-down | other>
663
+ Spec action needed: <none | resolve GAP | add matrix cell | correct suspicious N/A | update AS wording>
664
+ Invariant action needed: <none | add/update invariant: ...>
665
+
666
+ ─── SIMILAR RISK ───
667
+ (omit if scan skipped)
668
+ <Findings from Phase 5.3>
669
+
670
+ ─── RECOMMENDED ACTIONS ───
671
+ <From Phase 6>
672
+
673
+ ─── OPEN QUESTIONS ───
674
+ (omit if investigation is complete)
675
+ <Anything that couldn't be determined from code alone>
676
+ - <question — what additional info would help>
677
+ - <question — what test/experiment would clarify>
678
+
679
+ ════════════════════════════════════════════════════════════════
680
+ ```
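The `Cell:` field in the template accepts four shapes: `BM.AS-NNN.<surface>`, `GAP-NNN`, `N/A:<reason>`, and `NO_CELL`. A rough sketch of a parser for those shapes is below. `parseCellRef` and the returned field names are hypothetical, and treating `NNN` as "one or more digits" is an assumption, since the template does not define the width.

```javascript
// Hypothetical parser for the four cell-reference shapes in the template.
// Returns null for anything that matches none of them.
function parseCellRef(ref) {
  const text = ref.trim();
  let m;
  if (text === "NO_CELL") return { kind: "NO_CELL" };
  if ((m = /^BM\.AS-(\d+)\.(.+)$/.exec(text))) {
    return { kind: "CELL", asNumber: m[1], surface: m[2] };
  }
  if ((m = /^GAP-(\d+)$/.exec(text))) return { kind: "GAP", id: m[1] };
  if ((m = /^N\/A:\s*(.+)$/.exec(text))) return { kind: "NA", reason: m[1] };
  return null;
}

console.log(parseCellRef("BM.AS-012.checkout"));
```

A `null` result is a hint that the report line does not follow the template and needs a second look.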
681
+
682
+ ---
683
+
684
+ ## Handoff File
685
+
686
+ After producing the report, write it to `docs/investigate/<slug>-$(date +%Y-%m-%d).md` so `/sp-fix` can auto-detect it and skip redundant discovery.
687
+
688
+ - `<slug>` = kebab-case of the bug subject (e.g. `order-cancel-500`, `login-redirect-loop`)
689
+ - If a file for the same slug+date already exists → append a `-NN` suffix (`...-2026-04-21-02.md`)
690
+ - The file contains the full Investigation Report block above, no wrapping prose
691
+ - Mention the path at the end of the chat response so the user can open it
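The naming rules above can be sketched as a small path builder. Everything here is illustrative: `slugify`, `handoffPath`, and the injected `exists` predicate are hypothetical names, and starting the collision suffix at `-02` is an assumption taken from the single example in the list.

```javascript
// Hypothetical kebab-case helper: lowercases and collapses every run of
// non-alphanumeric characters into a single hyphen.
function slugify(subject) {
  return subject
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Hypothetical path builder. `date` is a YYYY-MM-DD string and `exists`
// is a caller-supplied predicate (for example fs.existsSync), so the
// sketch stays free of file-system access.
function handoffPath(subject, date, exists) {
  const base = `docs/investigate/${slugify(subject)}-${date}`;
  if (!exists(`${base}.md`)) return `${base}.md`;
  // Assumption: suffixes start at -02, as in the example above.
  for (let n = 2; ; n++) {
    const candidate = `${base}-${String(n).padStart(2, "0")}.md`;
    if (!exists(candidate)) return candidate;
  }
}

const taken = new Set(["docs/investigate/order-cancel-500-2026-04-21.md"]);
console.log(handoffPath("Order cancel 500", "2026-04-21", (p) => taken.has(p)));
```

Note that `$(date +%Y-%m-%d)` uses the local time zone, while `new Date().toISOString().slice(0, 10)` yields the UTC date, so the two can differ near midnight.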
692
+
693
+ Writing this file is the ONLY write operation this skill performs — it is output, not a code change. If `docs/investigate/` does not exist, create it.
694
+
695
+ After writing, signal handoff:
696
+
697
+ ```
698
+ ⚠️ Ready to fix — run `/sp-fix docs/investigate/<slug>-<date>.md`
699
+ (or paste root cause summary directly if you prefer to skip the file)
700
+ ```
701
+
702
+ ---
703
+
704
+ ## Rules
705
+
706
+ 1. **Read-only for code.** Never modify source code, tests, configs, or any file outside `docs/investigate/`. The investigation report file is the single allowed write.
707
+ 2. **Evidence over intuition.** Every claim in the report must reference specific code (file:line) or data (git commit, log output).
708
+ 3. **Specific over vague.** "The cache isn't invalidated after write at storage.rs:142" not "there might be a cache issue".
709
+ 4. **Complete the chain.** Root cause → intermediate steps → symptom. No gaps. If there's a gap, say so.
710
+ 5. **Honest confidence.** LOW means LOW. Don't inflate to get past Phase 4. INSUFFICIENT_EVIDENCE is a valid outcome.
711
+ 6. **Timebox.** If after 15 minutes of investigation you can't form a MEDIUM+ hypothesis → report INSUFFICIENT_EVIDENCE with everything gathered so far. Don't spin.
712
+ 7. **One investigation, one report.** If `$ARGUMENTS` describes multiple bugs, investigate the most severe first. Mention others in OPEN QUESTIONS.
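Rule 2 lends itself to a quick self-check: scan the Evidence lines and flag any that cite neither a `file:line` location nor something that looks like a commit hash. The sketch below is a heuristic only. `findUnanchoredClaims` and both regular expressions are hypothetical, log-output evidence cannot be recognised this way, and an ordinary word made of hex letters would be mistaken for a hash.

```javascript
// Heuristic patterns (assumptions, not part of specpipe):
// a path-like token followed by ":<line>", or a 7-40 character hex string.
const FILE_LINE = /[\w./-]+\.\w+:\d+/;
const COMMIT_HASH = /\b[0-9a-f]{7,40}\b/;

// Returns the evidence lines that carry neither kind of anchor.
function findUnanchoredClaims(evidenceLines) {
  return evidenceLines.filter(
    (line) => !FILE_LINE.test(line) && !COMMIT_HASH.test(line)
  );
}

console.log(
  findUnanchoredClaims([
    "storage.rs:142 shows the cache is not invalidated after write",
    "there might be a cache issue",
  ])
);
```

Lines returned by the check are candidates for more investigation, not automatic failures.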
713
+
714
+ **Red flags — slow down:**
715
+ - Jumping to a hypothesis before tracing the data flow — you're guessing
716
+ - "It's probably X" without file:line evidence — investigate more
717
+ - Confirming your theory instead of trying to disprove it — confirmation bias
718
+ - Spending 10+ minutes on similar-risk scan — timebox and move on