@gitwhy-cli/whyspec 0.1.17 → 0.1.19
- package/dist/adapters/claude-code.d.ts +5 -10
- package/dist/adapters/claude-code.d.ts.map +1 -1
- package/dist/adapters/claude-code.js +55 -115
- package/dist/adapters/claude-code.js.map +1 -1
- package/dist/adapters/codex.d.ts.map +1 -1
- package/dist/adapters/codex.js +12 -9
- package/dist/adapters/codex.js.map +1 -1
- package/dist/adapters/types.d.ts +3 -1
- package/dist/adapters/types.d.ts.map +1 -1
- package/dist/adapters/types.js +15 -13
- package/dist/adapters/types.js.map +1 -1
- package/dist/commands/init.d.ts.map +1 -1
- package/dist/commands/init.js +23 -29
- package/dist/commands/init.js.map +1 -1
- package/package.json +2 -2
- package/skill-sources/whyspec-capture/SKILL.md +241 -0
- package/skill-sources/whyspec-debug/SKILL.md +288 -0
- package/skill-sources/whyspec-execute/SKILL.md +210 -0
- package/skill-sources/whyspec-plan/SKILL.md +331 -0
- package/skill-sources/whyspec-search/SKILL.md +137 -0
- package/skill-sources/whyspec-show/SKILL.md +140 -0
- package/skills/whyspec-capture/SKILL.md +0 -154
- package/skills/whyspec-debug/SKILL.md +0 -404
- package/skills/whyspec-execute/SKILL.md +0 -118
- package/skills/whyspec-plan/SKILL.md +0 -170
- package/skills/whyspec-search/SKILL.md +0 -69
- package/skills/whyspec-show/SKILL.md +0 -90
@@ -0,0 +1,140 @@
+---
+name: whyspec-show
+description: Use when reviewing the full story of a change — intent, design, tasks, and Decision Bridge delta.
+argument-hint: "[change-name]"
+---
+
+Show the full story — from intent through design, tasks, and reasoning — with the Decision Bridge delta.
+
+---
+
+**Input**: A change name. If omitted, prompt for available changes.
+
+## What Makes a Good Story
+
+The show command doesn't just dump files — it tells the **narrative arc** of a change:
+1. **Intent** — Why did we start this? What pain existed?
+2. **Design** — How did we plan to solve it? What trade-offs did we consider?
+3. **Tasks** — What concrete work was done? How much is complete?
+4. **Reasoning** — What actually happened? What surprised us?
+5. **Decision Bridge** — How did our thinking evolve from plan to reality?
+
+The Decision Bridge delta is the most valuable output — it reveals the gap between what we THOUGHT we'd do and what we ACTUALLY did.
+
+## Steps
+
+1. **Get the full story from CLI**
+
+```bash
+whyspec show --json "<name>"
+```
+
+Parse the JSON response:
+- `intent`: Content of intent.md
+- `design`: Content of design.md
+- `tasks`: Content of tasks.md with completion status
+- `context`: Content of ctx_<id>.md (if captured)
+- `decision_bridge_delta`: Computed delta of planned vs actual decisions
+- `surprises`: Decisions not in the original plan
+
+If no change name provided:
+- Run `whyspec list --json` to get available changes
+- Let the user select
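
The parsing step above can be sketched roughly as follows. This is a minimal Python illustration, not the package's implementation: the six field names come from the skill text, but the value types, the `split_story` helper, and the sample payload are all assumptions made for the example.

```python
import json

# Field names are taken from the skill text; the value types are assumed to be
# plain strings or lists, which the source does not confirm.
STORY_FIELDS = ("intent", "design", "tasks", "context",
                "decision_bridge_delta", "surprises")


def split_story(raw_json):
    """Split a `whyspec show --json` payload into present and missing sections."""
    payload = json.loads(raw_json)
    # Treat absent keys, null, and empty values alike as "not captured".
    present = {name: payload[name] for name in STORY_FIELDS if payload.get(name)}
    missing = [name for name in STORY_FIELDS if name not in present]
    return present, missing


# Hypothetical payload for illustration; real CLI output may differ.
sample = json.dumps({
    "intent": "Rate-limit the public API",
    "design": "Redis-backed sliding window",
    "tasks": "- [x] add middleware\n- [ ] add tests",
    "context": None,
})
present, missing = split_story(sample)
print("present:", sorted(present))
print("missing:", missing)
```

Knowing which sections are missing up front makes it straightforward to pick the matching suggestion when a story is only partially captured.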
+
+2. **Display the full story as a narrative arc**
+
+```
+# <Change Name>
+
+## Intent (WHY)
+[From intent.md — problem statement, what it enables, constraints, success criteria]
+
+## Design (HOW)
+[From design.md — approach, architecture, trade-offs considered]
+
+## Tasks (WHAT)
+[From tasks.md — task list with completion status]
+Progress: N/M tasks complete
+
+## Reasoning (AFTER)
+[From ctx_<id>.md — story of what happened, decisions made, trade-offs accepted]
+```
+
+If context hasn't been captured yet:
+```
+## Reasoning (AFTER)
+Not yet captured. Run /whyspec:capture to complete the story.
+```
+
+3. **Highlight the Decision Bridge Delta**
+
+When both plan files and context exist, display the evolution:
+
+<examples>
+<good>
+## Decision Bridge
+
+| Decision | Planned (Before) | Actual (After) | Delta |
+|----------|-------------------|----------------|-------|
+| Rate limit storage | Redis vs in-memory? | Redis — 3 instances need shared state | As planned |
+| Limit granularity | per-IP vs per-token? | Both — IP for anon, token for auth'd | Expanded scope |
+| 429 response | standard vs custom? | Standard + Retry-After header | As planned |
+
+### Surprises (not in original plan)
+- **Added X-Request-ID middleware** — needed for debugging 429 responses.
+Impact: New dependency on uuid package, new middleware in chain.
+- **Changed Redis key schema to sorted sets** — sliding window algorithm
+requires sorted sets, not simple strings. Affects memory profile.
+
+### Delta Summary
+- 3 planned decisions: 2 as planned, 1 expanded scope
+- 2 surprises: both additive (no plan contradictions)
+- Design alignment: HIGH — implementation closely followed the plan
+Why good: Every decision shows BEFORE (question) and AFTER (answer + rationale).
+Surprises are documented with impact. Delta summary gives the big picture.
+The "Expanded scope" and "As planned" labels make evolution visible at a glance.
+</good>
+
+<bad>
+## Decision Bridge
+
+| Decision | Status |
+|----------|--------|
+| Storage | Done |
+| Limit scope | Done |
+| 429 response | Done |
+Why bad: No before/after comparison. "Done" tells you nothing about what
+was decided or why. The entire point of the Decision Bridge is showing
+how thinking evolved.
+</bad>
+</examples>
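
As a rough illustration of how the first "Delta Summary" line in the good example could be derived from the table rows, here is a small Python sketch. The row tuple layout and the `delta_summary` helper are hypothetical; the source does not say how the CLI computes `decision_bridge_delta`.

```python
from collections import Counter


def delta_summary(rows):
    """Summarise delta labels for rows of (decision, planned, actual, delta)."""
    counts = Counter(delta for _decision, _planned, _actual, delta in rows)
    # most_common() lists the most frequent label first; ties keep input order.
    parts = ", ".join(f"{n} {label.lower()}" for label, n in counts.most_common())
    return f"{len(rows)} planned decisions: {parts}"


# Rows copied from the good example above (answers abbreviated).
rows = [
    ("Rate limit storage", "Redis vs in-memory?", "Redis", "As planned"),
    ("Limit granularity", "per-IP vs per-token?", "Both", "Expanded scope"),
    ("429 response", "standard vs custom?", "Standard + Retry-After header", "As planned"),
]
print(delta_summary(rows))
```

For the three rows in the example this reproduces the summary line shown in the skill text.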
+
+4. **Handle partial stories gracefully**
+
+Not every change has all files. Show what exists and guide the user:
+
+| Missing | What to show | Suggestion |
+|---------|-------------|------------|
+| No design.md | Intent + Tasks only | "Design not captured. Was this a quick fix?" |
+| No tasks.md | Intent + Design only | "No tasks defined. Run `/whyspec:execute` to start." |
+| No context | Intent + Design + Tasks | "Reasoning not captured yet. Run `/whyspec:capture`." |
+| No intent (shouldn't happen) | Whatever exists | "Intent missing — this change may not have been planned with WhySpec." |
+| Tasks partially done | Show progress bar | "Progress: ███░░ 3/5 tasks — resume with `/whyspec:execute <name>`" |
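
The progress bar in the last table row could be rendered along these lines. This is only a sketch: the five-cell width is inferred from the `███░░ 3/5` example, and the `progress_bar` helper and its handling of an empty task list are assumptions.

```python
def progress_bar(done, total, width=5):
    """Render a fixed-width bar such as 'Progress: ███░░ 3/5 tasks'."""
    if total <= 0:
        # Assumed wording for the no-tasks case; the source only gives a suggestion.
        return "Progress: no tasks defined"
    done = max(0, min(done, total))        # clamp so the bar never overflows
    filled = round(width * done / total)   # number of filled cells
    return f"Progress: {'█' * filled}{'░' * (width - filled)} {done}/{total} tasks"


print(progress_bar(3, 5))
```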
+
+## Tools
+
+| Tool | When to use | When NOT to use |
+|------|------------|-----------------|
+| **Bash** | Run `whyspec show --json "<name>"` and `whyspec list --json` | Don't modify anything — this is read-only |
+| **Read** | Read raw files if CLI output is insufficient or truncated | Don't read files the CLI already provided |
+
+No AskUserQuestion — this is a read-only display skill. If change name is missing, use `whyspec list --json` and let the user select.
+
+## Guardrails
+
+- **Show all available phases** — always display intent, design, tasks, and context (if present). Don't skip sections.
+- **Always show the Decision Bridge delta** — when both plan and context exist, the delta table is mandatory.
+- **Handle missing files gracefully** — show what's available, note what's missing, suggest the command to fill gaps.
+- **Read-only** — this skill displays information. It never modifies files.
+- **Show completion status** — for tasks, show N/M complete. For the Decision Bridge, show resolved vs pending counts.
+- **Tell the story, don't dump data** — organize output as a narrative arc, not a raw file concatenation.
@@ -1,154 +0,0 @@
----
-name: whyspec-capture
-description: Capture reasoning after implementation — resolve the Decision Bridge by mapping planned decisions to actual outcomes and recording surprises. Use after coding to preserve the WHY behind what was built.
----
-
-Capture reasoning — create a context file that resolves the Decision Bridge and preserves the full story.
-
-View the complete story with `/whyspec-show`
-
----
-
-**Input**: Optionally specify a change name. If omitted, auto-detect the most recently executed change.
-
-**Steps**
-
-1. **Select the change**
-
-If a name is provided, use it. Otherwise:
-- Auto-detect the most recently executed change (look for changes with completed tasks)
-- If ambiguous, run `whyspec list --json` and use **AskUserQuestion** to select
-
-2. **Read plan files for Decision Bridge mapping**
-
-Read these files from the change folder — **required** before generating context:
-- `<path>/intent.md` — the stated intent, "Decisions to Make" checkboxes
-- `<path>/design.md` — the approach, "Questions to Resolve" items
-
-Extract and track:
-- Every `- [ ]` or `- [x]` item under "Decisions to Make" → each MUST be resolved in the context
-- Every item under "Questions to Resolve" → each MUST be answered
-- The stated constraints and success criteria → compare against what actually happened
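
The checkbox extraction described in this removed skill text might look roughly like the following Python sketch. It assumes "Decisions to Make" is a Markdown heading at any level inside intent.md, which the source does not confirm; the `decisions_to_make` helper and the sample text are hypothetical.

```python
import re

# Matches "- [ ] text" and "- [x] text" list items.
CHECKBOX = re.compile(r"^\s*- \[([ xX])\]\s+(.+)$")


def decisions_to_make(markdown, heading="Decisions to Make"):
    """Return (text, checked) pairs for checkboxes under the given heading."""
    items, in_section = [], False
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            # Any heading either opens the target section or closes it.
            in_section = stripped.lstrip("#").strip() == heading
            continue
        match = CHECKBOX.match(line) if in_section else None
        if match:
            items.append((match.group(2).strip(), match.group(1).lower() == "x"))
    return items


# Hypothetical intent.md content for illustration.
intent_md = """# Intent
## Decisions to Make
- [ ] Redis vs in-memory?
- [x] per-IP vs per-token?
## Constraints
- [ ] not a decision
"""
print(decisions_to_make(intent_md))
```

Each returned item can then be checked off against the captured context, so that no planned decision is left unresolved.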
-
-3. **Get capture data from CLI**
-
-```bash
-whyspec capture --json "<name>"
-```
-
-Parse the JSON response:
-- `template`: Context file template
-- `commits`: Commits associated with this change (auto-detected from git)
-- `files_changed`: Files modified during implementation (auto-detected)
-- `decisions_to_make`: Decision checkboxes extracted from plan files
-- `change_name`: The change name for the header
-
-4. **Populate the Decision Bridge**
-
-This is the core of the capture. Map every planned decision to its outcome:
-
-a. **Decisions to Make → Decisions Made**: For EACH checkbox from intent.md's "Decisions to Make", record:
-- What was decided
-- Why (the rationale — not just the choice, but the reasoning)
-- Any constraints that influenced the decision
-
-b. **Questions to Resolve → Answers**: For EACH question from design.md's "Questions to Resolve", record:
-- The answer that emerged during implementation
-- How it was determined
-
-c. **Capture Surprises**: Identify decisions made during implementation that were NOT in the original plan. Ask yourself:
-- "What did we decide that we didn't plan to decide?"
-- "What changed from the original design?"
-- "What unexpected requirements emerged?"
-These surprises are often the most valuable part of the context.
-
-If a planned decision was NOT made during implementation, note it as unresolved and ask the user.
-
-5. **Generate ctx_<id>.md in SaaS XML format**
-
-Write to `<path>/ctx_<id>.md` using the GitWhy SaaS format:
-
-```xml
-<context>
-<title>Short title describing what was built and why</title>
-
-<story>
-Phase-organized engineering journal. First-person, chronological.
-Capture the FULL reasoning — not a summary.
-
-Phase 1 — [Setup/Context]:
-What the user asked for, initial understanding, preparation work.
-
-Phase 2 — [Implementation]:
-What was built, key decision points encountered, problems solved.
-Reference specific files and approaches.
-
-Phase 3 — [Verification]:
-How the work was verified, test results, manual checks.
-</story>
-
-<reasoning>
-Why this approach was chosen over alternatives.
-
-<decisions>
-- [Planned decision] — [chosen option] — [rationale]
-- [Planned decision] — [chosen option] — [rationale]
-</decisions>
-
-<rejected>
-- [Alternative not chosen] — [why it was rejected]
-</rejected>
-
-<tradeoffs>
-- [Trade-off accepted] — [what was gained vs lost]
-</tradeoffs>
-</reasoning>
-
-<files>
-path/to/file.ts — new — Brief description
-path/to/other.ts — modified — Brief description
-</files>
-
-<agent>claude-code (model-name)</agent>
-<tags>comma, separated, domain, keywords</tags>
-<verification>Test results and build status</verification>
-<risks>Open questions, follow-up items, known limitations</risks>
-</context>
-```
-
-**Surprises** go in the `<story>` narrative AND as a clearly labeled section in `<reasoning>`:
-
-```
-Surprises (decisions not in the original plan):
-- [Unexpected decision] — [why it was needed]
-- [Scope change] — [what triggered it]
-```
-
-6. **Show summary**
-
-```
-## Reasoning Captured: <name>
-
-Context: ctx_<id>.md
-
-Decision Bridge:
-Planned decisions resolved: N/N
-Questions answered: N/N
-Surprises captured: N
-
-Files documented: N
-Commits linked: N
-
-View the full story: /whyspec-show <name>
-```
-
-**Guardrails**
-
-- **Must read plan files FIRST** — never generate context without reading intent.md and design.md. The Decision Bridge requires mapping FROM plan TO outcome.
-- **Every planned decision must be resolved** — if intent.md lists 5 "Decisions to Make", all 5 must appear in the context. Prompt the user for any that weren't addressed.
-- **Never skip surprises** — unplanned decisions are the most valuable context. Actively search for them.
-- **Capture reasoning, not summaries** — write "we chose X because Y outweighs Z in our constraints" not just "we used X." Full reasoning helps future developers understand the choice.
-- **Use SaaS XML format exactly** — the `<context>` tags must match the GitWhy format so `git why log` and `git why push` work without conversion.
-- **Include verification results** — what tests pass, what was manually verified. This grounds the context in evidence.
-- **Don't fabricate rationale** — if you don't know why a decision was made, ask the user. Invented reasoning is worse than no reasoning.
-- **One context per capture** — each `/whyspec-capture` invocation creates exactly one `ctx_<id>.md` file.
@@ -1,404 +0,0 @@
----
-name: whyspec-debug
-description: Debug with scientific method — gather symptoms, form falsifiable hypotheses, test systematically, verify root cause before fixing. Searches team knowledge first and captures the full investigation as persistent context.
----
-
-# WhySpec Debug — Scientific Investigation
-
-Debug systematically. No fix without root cause.
-
-This skill implements a structured debugging process that captures the full investigation
-as persistent context — symptoms, hypotheses, evidence, root cause, and fix rationale.
-
-The investigation is automatically saved as a context file when resolved.
-
----
-
-## Purpose
-
-Debugging is not guessing. This skill enforces:
-
-1. **Team knowledge first** — search past reasoning before reinventing
-2. **Scientific method** — falsifiable hypotheses tested with evidence
-3. **Iron Law** — no fix is proposed until root cause is verified
-4. **Persistent state** — debug.md survives context resets so investigations can resume
-5. **Reasoning capture** — every investigation produces a context file for future developers
-
----
-
-**Input**: A bug description, error message, or change name for an existing debug session.
-
----
-
-## Step 0: Team Knowledge Search
-
-Before investigating, check if someone has reasoned about this domain before:
-
-```bash
-whyspec search --json "<keywords from bug description>"
-```
-
-If results exist:
-- Display: "Found N past contexts in this domain"
-- List relevant titles and key decisions from past investigations
-- Note any past decisions that might inform the current bug
-
-If no results: note "No prior context found" and continue.
-
-This step takes seconds. It prevents re-investigating solved problems and surfaces past decisions that may explain the current behavior.
-
----
-
-## Step 1: Symptoms Gathering
-
-Create the debug session:
-
-```bash
-whyspec debug --json "<bug-name>"
-```
-
-Parse the JSON response:
-- `path`: Debug session directory (e.g., `.gitwhy/changes/<bug-name>/`)
-- `template`: debug.md template structure
-- `related_contexts`: Past contexts in the same domain (from Step 0)
-
-**Gather symptoms** — use **AskUserQuestion** if the user hasn't provided enough detail, or investigate the codebase directly:
-
-| Symptom | What to capture |
-|---------|----------------|
-| Expected behavior | What SHOULD happen |
-| Actual behavior | What ACTUALLY happens |
-| Error messages | Exact text, stack traces, error codes |
-| Reproduction steps | Minimal sequence to trigger the bug |
-| Timeline | When it started, what changed recently |
-| Scope | Who is affected, how often, which environments |
-
-**Write debug.md immediately** — this file IS the investigation state:
-
-```markdown
-# Debug: <bug-name>
-
-## Status: INVESTIGATING
-
-## Symptoms
-
-**Expected:** [what should happen]
-**Actual:** [what actually happens]
-**Error:**
-```
-[exact error message or stack trace]
-```
-**Reproduction:** [minimal steps to reproduce]
-**Timeline:** [when it started, what changed recently]
-**Scope:** [who/what is affected, frequency]
-
-## Related Past Contexts
-
-[Results from Step 0, or "None found"]
-[If found: relevant decisions, reasoning excerpts]
-
-## Hypotheses
-
-[Populated in Step 2]
-
-## Evidence Log
-
-[Populated in Step 3]
-
-## Root Cause
-
-[Populated in Step 4]
-
-## Fix
-
-[Populated in Step 5]
-
-## Prevention
-
-[Populated in Step 5]
-```
-
-**CRITICAL**: Write debug.md to `<path>/debug.md` NOW, after this step. It persists across context resets and enables resuming the investigation.
-
----
-
-## Step 2: Hypothesis Formation
-
-Form **3 or more falsifiable hypotheses**. Each must include a specific claim, a concrete test, and a way to disprove it:
-
-```markdown
-## Hypotheses
-
-### H1: [Specific, testable claim about the root cause]
-- **Test:** [Concrete action — a command to run, a log to check, a condition to verify]
-- **Disproof:** [What evidence would prove this hypothesis WRONG]
-- **Status:** UNTESTED
-- **Likelihood:** HIGH / MEDIUM / LOW
-
-### H2: [Different claim — consider a different subsystem or mechanism]
-- **Test:** [Concrete action]
-- **Disproof:** [What would disprove it]
-- **Status:** UNTESTED
-- **Likelihood:** HIGH / MEDIUM / LOW
-
-### H3: [Third claim — consider edge cases, race conditions, configuration]
-- **Test:** [Concrete action]
-- **Disproof:** [What would disprove it]
-- **Status:** UNTESTED
-- **Likelihood:** HIGH / MEDIUM / LOW
-```
-
-**Hypothesis quality rules:**
-- Each hypothesis must be **specific enough to test** — "something is wrong with auth" is not a hypothesis
-- Each hypothesis must be **falsifiable** — there must be evidence that could prove it wrong
-- Hypotheses should target **different root causes** — not three variations of the same idea
-- **Use past contexts**: if Step 0 found related reasoning, let those decisions inform your hypotheses. A past choice ("we used X because of Y") might explain the current behavior.
-
-Rank by likelihood. Test the most likely first.
-
-Update debug.md with all hypotheses before proceeding.
-
----
-
-## Step 3: Hypothesis Testing
-
-Test each hypothesis **one at a time, sequentially**. For each:
-
-1. **Execute the test** described in the hypothesis
-2. **Record evidence** — exact output, logs, observed behavior
-3. **Evaluate** — does the evidence support, refute, or leave the hypothesis inconclusive?
-4. **Update status**: `CONFIRMED`, `DISPROVED`, or `INCONCLUSIVE`
-5. **Update debug.md immediately** with findings
-
-```markdown
-## Evidence Log
-
-### H1: [claim] — DISPROVED
-**Test performed:** [exact command or action taken]
-**Evidence:**
-```
-[exact output, log entries, or observations]
-```
-**Conclusion:** [why this hypothesis is disproved — what the evidence shows]
-
-### H2: [claim] — CONFIRMED
-**Test performed:** [exact command or action taken]
-**Evidence:**
-```
-[exact output showing the root cause]
-```
-**Conclusion:** [why this is confirmed — the causal link between evidence and symptom]
-```
-
-**Testing rules:**
-
-- **One hypothesis at a time** — never test multiple simultaneously. Confounded evidence is useless.
-- **Max 3 tests per hypothesis** — if evidence is inconclusive after 3 attempts, mark INCONCLUSIVE and move to the next.
-- **Preserve the crime scene** — before modifying suspect code, record its current state in the evidence log.
-- **Update debug.md after each test** — don't batch. Each test result is written immediately.
-
-If ALL hypotheses are disproved or inconclusive:
-- Form new hypotheses based on what the evidence revealed
-- If still stuck after a second round, escalate to the user (see Escalation Rules)
-
----
-
-## Step 4: Root Cause Verification — The Iron Law
-
-**No fix without verified root cause.**
-
-Before proposing ANY fix, you must:
-
-1. **State the root cause clearly and specifically**
-2. **Explain the causal chain**: [trigger] → [mechanism] → [symptom]
-3. **Verify predictive power**: can you predict the symptom from the cause? Can you reliably reproduce it?
-
-Update debug.md:
-
-```markdown
-## Root Cause
-
-**Cause:** [precise description of what is wrong — not symptoms, the actual defect]
-**Causal chain:** [trigger event] → [mechanism/code path] → [observed symptom]
-**Verified by:** [how the causal link was confirmed — which test, which evidence]
-**Confidence:** HIGH / MEDIUM / LOW
-```
-
-**Confidence thresholds:**
-
-| Confidence | Criteria | Action |
-|-----------|----------|--------|
-| HIGH | Reproduction is reliable, causal chain is clear, evidence is unambiguous | Proceed to fix |
-| MEDIUM | Strong evidence but some uncertainty remains | Proceed with caution, note risks |
-| LOW | Circumstantial evidence, cannot reliably reproduce | **Escalate to user** — do NOT fix |
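
The confidence table above amounts to a small gate in front of the fix step. A hedged Python sketch of that gate follows; the `may_fix` helper and its return shape are illustrative assumptions, and only the three levels and their actions come from the table.

```python
# Actions paraphrased from the confidence table above.
ACTIONS = {
    "HIGH": "proceed to fix",
    "MEDIUM": "proceed with caution, note risks",
    "LOW": "escalate to user, do not fix",
}


def may_fix(confidence):
    """Return (fix_allowed, action) for a root-cause confidence level."""
    level = confidence.strip().upper()
    if level not in ACTIONS:
        raise ValueError(f"unknown confidence level: {confidence!r}")
    # Only HIGH and MEDIUM clear the gate; LOW always escalates.
    return level in ("HIGH", "MEDIUM"), ACTIONS[level]


print(may_fix("medium"))
print(may_fix("LOW"))
```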
-
-If confidence is LOW:
-- Present all evidence gathered to the user via **AskUserQuestion**
-- Show: what was tested, what was found, what remains uncertain
-- Ask for additional context, access, or direction
-- **Do NOT guess at a fix**
-
-Update debug.md status: `## Status: ROOT CAUSE IDENTIFIED`
-
----
-
-## Step 5: Fix + Auto-Capture
-
-Once root cause is verified with HIGH or MEDIUM confidence:
-
-### 5a. Implement the fix
-
-- Make the **minimal, targeted change** that addresses the root cause
-- Don't refactor surrounding code — fix the bug, nothing more
-- Verify the fix resolves the symptom (run the reproduction steps again)
-
-### 5b. Update debug.md
-
-```markdown
-## Fix
-
-**Change:** [what was modified and how]
-**Files:** [files changed]
-**Verification:** [how the fix was confirmed — test results, manual reproduction]
-
-## Prevention
-
-**How to prevent recurrence:**
-- [Concrete preventive measure — e.g., "add input validation for X"]
-- [Process improvement — e.g., "add test case for this edge case"]
-- [Monitoring — e.g., "add alert for this error pattern"]
-```
-
-Update debug.md status: `## Status: RESOLVED`
-
-### 5c. Commit the fix
-
-Commit atomically with a clear message referencing the root cause.
-
-### 5d. Auto-capture reasoning
-
-Generate a context file to preserve the full investigation:
-
-```bash
-whyspec capture --json "<bug-name>"
-```
-
-Write `<path>/ctx_<id>.md` in SaaS XML format:
-
-```xml
-<context>
-<title>Debug: [short description — bug and fix]</title>
-
-<story>
-Phase 1 — Symptoms:
-[What was observed, when it started, reproduction steps]
-
-Phase 2 — Investigation:
-[Hypotheses formed, tests performed, evidence gathered]
-[Which hypotheses were disproved and why]
-
-Phase 3 — Root Cause:
-[The actual defect, causal chain, how it was verified]
-
-Phase 4 — Fix:
-[What was changed, how the fix was confirmed]
-</story>
-
-<reasoning>
-Why the bug existed and why this fix is correct.
-
-<decisions>
-- [Fix approach chosen] — [rationale for this approach]
-</decisions>
-
-<rejected>
-- [Alternative fix considered] — [why it was rejected]
-- [Disproved hypothesis] — [what evidence ruled it out]
-</rejected>
-
-<tradeoffs>
-- [Any trade-offs in the fix — scope, performance, complexity]
-</tradeoffs>
-</reasoning>
-
-<files>
-[Files changed to fix the bug]
-</files>
-
-<verification>[Test results confirming the fix]</verification>
-<risks>[Potential side effects, related areas to watch]</risks>
-</context>
-```
-
-### 5e. Show summary
-
-```
-## Debug Complete: <bug-name>
-
-Root cause: [one-line summary]
-Fix: [what was changed]
-Context: ctx_<id>.md
-
-Investigation:
-Hypotheses tested: N (M confirmed, P disproved)
-Evidence entries: N
-Past contexts referenced: N
-
-View full investigation: /whyspec-show <bug-name>
-```
-
----
-
-## Resuming an Investigation
-
-If the user invokes `/whyspec-debug` and a `debug.md` already exists for that change:
-
-1. **Read debug.md** from the change folder
-2. **Check the Status field** and resume from the appropriate step:
-
-| Status | Resume from |
-|--------|-------------|
-| `INVESTIGATING` | Last completed step — check which sections are populated |
-| `ROOT CAUSE IDENTIFIED` | Step 5 — implement the fix |
-| `RESOLVED` | Investigation is complete — show summary |
-
-3. **Announce**: "Resuming debug session: <name> — Status: <status>"
-4. **Show progress**: display completed sections and what remains
-
-This is why writing debug.md incrementally is critical — it's the contract for resumability.
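
The resume logic above can be sketched as a lookup on the `## Status:` line of debug.md. This is a minimal Python illustration under stated assumptions: the three status values and resume targets come from the table, while the `resume_point` helper, the regular expression, and the fallback for an unrecognised status are hypothetical.

```python
import re

# Resume targets paraphrased from the status table above.
RESUME_FROM = {
    "INVESTIGATING": "last completed step (check which sections are populated)",
    "ROOT CAUSE IDENTIFIED": "Step 5: implement the fix",
    "RESOLVED": "investigation complete: show summary",
}


def resume_point(debug_md):
    """Return (status, resume target) parsed from debug.md, or None if no status line."""
    match = re.search(r"^## Status:[ \t]*(.+)$", debug_md, flags=re.MULTILINE)
    if match is None:
        return None
    status = match.group(1).strip().upper()
    # The fallback wording is an assumption; the source lists only three statuses.
    return status, RESUME_FROM.get(status, "unknown status: ask the user")


# Hypothetical debug.md excerpt for illustration.
debug_md = "# Debug: login-500\n\n## Status: ROOT CAUSE IDENTIFIED\n\n## Symptoms\n"
print(resume_point(debug_md))
```

Because the status line is rewritten at each stage, a fresh session only needs this one field to decide where to pick up.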
-
----
-
-## Escalation Rules
-
-Escalate to the user (via **AskUserQuestion**) when:
-
-| Trigger | What to present |
-|---------|----------------|
-| All hypotheses disproved (2 rounds) | Full evidence summary, ask for new direction |
-| Cannot reproduce | Symptoms documented, ask for environment details or access |
-| Root cause outside codebase | Findings documented, suggest infrastructure/environment investigation |
-| Root cause confidence is LOW | Evidence summary, explain uncertainty, ask for guidance |
-| Fix would introduce significant risk | Proposed fix, risk assessment, ask for approval |
-
-When escalating, always present:
-- What was tested and what was found
-- What remains uncertain
-- A specific question or request for the user
-
-**Never silently give up.** If you're stuck, say so with evidence.
-
----
-
-## Guardrails
-
-- **No fix without root cause** — the Iron Law is non-negotiable. Never propose a fix based on a guess, a hunch, or pattern-matching without evidence.
-- **Max 3 tests per hypothesis** — if evidence is inconclusive after 3 attempts, mark INCONCLUSIVE and form new hypotheses or escalate.
-- **Always capture reasoning** — every debug session MUST produce both `debug.md` AND `ctx_<id>.md`. No silent fixes. The investigation is as valuable as the fix.
-- **Write debug.md incrementally** — update after EVERY step, not at the end. This is the resumability contract. If context resets, the investigation survives.
-- **Don't skip team knowledge** — always run Step 0, even for "obvious" bugs. Past contexts prevent repeated mistakes and surface relevant decisions.
-- **Don't guess at root cause** — if uncertain after investigation, escalate. Wrong diagnosis leads to wrong fixes that mask the real problem.
-- **Test one hypothesis at a time** — never test multiple simultaneously. Sequential testing produces clean evidence.
-- **Preserve evidence** — before modifying suspect code, record its current state. Don't destroy the crime scene.
-- **Minimal fixes only** — fix the bug, don't refactor. Keep the diff focused on the root cause.
-- **Don't skip prevention** — after fixing, always document how to prevent recurrence. Future developers need this.