@haposoft/cafekit 0.8.1 → 0.8.3
- package/README.md +2 -0
- package/package.json +5 -2
- package/src/claude/CLAUDE.md +1 -0
- package/src/claude/agents/debugger.md +58 -4
- package/src/claude/agents/docs-keeper.md +1 -1
- package/src/claude/agents/god-developer.md +2 -2
- package/src/claude/agents/project-manager.md +1 -1
- package/src/claude/agents/spec-maker.md +22 -19
- package/src/claude/agents/test-runner.md +1 -0
- package/src/claude/agents/ui-ux-designer.md +3 -3
- package/src/claude/migration-manifest.json +1 -0
- package/src/claude/references/debugger/condition-based-waiting.md +56 -0
- package/src/claude/references/debugger/frontend-verification.md +59 -0
- package/src/claude/references/debugger/performance-diagnostics.md +76 -0
- package/src/claude/references/debugger/side-effect-gate.md +48 -0
- package/src/claude/rules/manage-docs.md +2 -2
- package/src/claude/settings/settings.json +1 -1
- package/src/claude/skills/ai-multimodal/SKILL.md +1 -1
- package/src/claude/skills/brainstorm/SKILL.md +2 -2
- package/src/claude/skills/chrome-devtools/SKILL.md +1 -1
- package/src/claude/skills/code-review/SKILL.md +1 -1
- package/src/claude/skills/debug/SKILL.md +216 -0
- package/src/claude/skills/develop/SKILL.md +1 -1
- package/src/claude/skills/develop/references/quality-gate.md +3 -3
- package/src/claude/skills/develop/references/subagent-patterns.md +10 -10
- package/src/claude/skills/frontend-design/SKILL.md +1 -1
- package/src/claude/skills/hotfix/SKILL.md +30 -10
- package/src/claude/skills/hotfix/references/diagnosis-protocol.md +28 -4
- package/src/claude/skills/hotfix/references/parallel-patterns.md +13 -13
- package/src/claude/skills/hotfix/references/prevention-gate.md +8 -1
- package/src/claude/skills/hotfix/references/workflow-specialized.md +3 -1
- package/src/claude/skills/inspect/SKILL.md +2 -2
- package/src/claude/skills/inspect/references/external-gemini-inspection.md +11 -11
- package/src/claude/skills/research/SKILL.md +1 -1
- package/src/claude/skills/specs/SKILL.md +6 -6
- package/src/claude/skills/specs/references/codebase-analysis.md +1 -1
- package/src/claude/skills/test/SKILL.md +1 -1
- package/src/claude/skills/ai-multimodal/scripts/.coverage +0 -0
- package/src/claude/skills/ai-multimodal/scripts/tests/.coverage +0 -0
- package/src/claude/skills/pdf/scripts/__pycache__/check_bounding_boxes.cpython-314.pyc +0 -0
|
@@ -291,7 +291,7 @@ node $SKILL_DIR/packages/spec/src/claude/chrome-devtools/tmp/login-test.js
|
|
|
291
291
|
|
|
292
292
|
Skills can exist in **project-scope** or **user-scope**. Priority: project-scope > user-scope.
|
|
293
293
|
|
|
294
|
-
**IMPORTANT:**
|
|
294
|
+
**IMPORTANT:** Store browser artifacts in a project-local screenshots or reports folder and include exact output paths in the final report.
|
|
295
295
|
|
|
296
296
|
Store screenshots for analysis in `<project>/packages/spec/src/claude/chrome-devtools/screenshots/`:
|
|
297
297
|
|
|
@@ -51,7 +51,7 @@ Does the code match what was requested?
|
|
|
51
51
|
|
|
52
52
|
### Stage 3 — Adversarial Review (Red-Team)
|
|
53
53
|
Actively try to break the code.
|
|
54
|
-
- **Edge Case Scouting:** If the Pull Request modifies >= 5 files,
|
|
54
|
+
- **Edge Case Scouting:** If the Pull Request modifies >= 5 files, activate `hapo:inspect` or call the `inspector` agent to scout where modified functions/components are imported and whether boundary errors exist before finishing the review.
|
|
55
55
|
- Find security holes (XSS, SQL Injection, Hardcoded tokens, Exposed Secrets).
|
|
56
56
|
- Find false assumptions, resource exhaustion loops, and race conditions.
|
|
57
57
|
- Find unhandled edge cases (e.g. empty strings, null pointers, negative integers).
|
|
@@ -0,0 +1,216 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: hapo:debug
|
|
3
|
+
description: "Use before fixing any bug, failing test, CI/CD failure, production incident, performance issue, UI regression, flaky test, or unexpected behavior. Diagnostic-only root-cause workflow with evidence, hypotheses, blast-radius mapping, and verification plan."
|
|
4
|
+
argument-hint: "[issue] --quick|--ci|--frontend|--perf"
|
|
5
|
+
version: "1.0.0"
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
# Debug - Evidence-First Root Cause Analysis
|
|
9
|
+
|
|
10
|
+
Debugging is diagnosis, not repair. Find the source of the failure before changing product code.
|
|
11
|
+
|
|
12
|
+
## Arguments
|
|
13
|
+
|
|
14
|
+
- `--quick` - Abbreviated path for syntax, lint, type, or single-test failures with obvious local scope
|
|
15
|
+
- `--ci` - Focus on CI/CD logs, runner environment, dependency versions, and pipeline setup
|
|
16
|
+
- `--frontend` - Include browser console, screenshot, accessibility tree, network, and responsive checks
|
|
17
|
+
- `--perf` - Include baseline measurements, bottleneck layer, profiling, and before/after targets
|
|
18
|
+
|
|
19
|
+
Default: systematic diagnosis with no product-code edits.
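As a rough illustration of the Arguments list above, the flag-to-mode mapping could be sketched as follows. The function name `select_mode` and the rule that two mode flags conflict are assumptions made for this sketch; the skill file only lists the flags and states the default.

```python
from typing import Sequence

# Flags from the Arguments section; mode names mirror the report's "Mode" field.
FLAG_TO_MODE = {
    "--quick": "quick",
    "--ci": "ci",
    "--frontend": "frontend",
    "--perf": "perf",
}


def select_mode(args: Sequence[str]) -> tuple[str, str]:
    """Return (mode, issue_text) for a hapo:debug-style argument list.

    Assumption (not stated in the skill): at most one mode flag is given.
    With no flag, the default "standard" diagnosis applies.
    """
    modes = [FLAG_TO_MODE[a] for a in args if a in FLAG_TO_MODE]
    if len(modes) > 1:
        raise ValueError(f"conflicting mode flags: {modes}")
    # Everything that is not a mode flag is treated as the issue description.
    issue = " ".join(a for a in args if a not in FLAG_TO_MODE)
    return (modes[0] if modes else "standard", issue)
```

For example, `select_mode(["login", "fails", "--ci"])` yields the `ci` mode with the issue text `login fails`.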
|
|
20
|
+
|
|
21
|
+
<HARD-GATE>
|
|
22
|
+
Do NOT implement a fix inside `hapo:debug`.
|
|
23
|
+
Do NOT recommend a fix until the root-cause contract is complete.
|
|
24
|
+
Do NOT stop at the first plausible explanation. Test hypotheses against evidence.
|
|
25
|
+
If 2+ hypotheses are refuted, change strategy before continuing.
|
|
26
|
+
If evidence is insufficient, report `Root cause: unknown` with the missing evidence needed.
|
|
27
|
+
</HARD-GATE>
|
|
28
|
+
|
|
29
|
+
Temporary instrumentation is allowed only when it is the minimal way to observe hidden state. Remove it before finishing and report what was instrumented.
|
|
30
|
+
|
|
31
|
+
## Process Flow
|
|
32
|
+
|
|
33
|
+
```mermaid
|
|
34
|
+
flowchart TD
|
|
35
|
+
A[Issue Input] --> B[Step 1: Scout via hapo:inspect]
|
|
36
|
+
B --> C[Step 2: Capture Evidence]
|
|
37
|
+
C --> D[Step 3: Pattern Analysis]
|
|
38
|
+
D --> E[Step 4: Hypothesis Tests]
|
|
39
|
+
E --> F[Step 5: Root Cause Trace]
|
|
40
|
+
F --> G[Step 6: Blast Radius + Verification Plan]
|
|
41
|
+
G --> H[Diagnostic Report]
|
|
42
|
+
H --> I{Fix requested?}
|
|
43
|
+
I -->|Yes| J[Hand off to hapo:hotfix]
|
|
44
|
+
I -->|No| K[Stop after diagnosis]
|
|
45
|
+
```
|
|
46
|
+
|
|
47
|
+
**This diagram is the authoritative workflow.** `hapo:debug` stops at diagnosis unless the user explicitly asks to fix.
|
|
48
|
+
|
|
49
|
+
---
|
|
50
|
+
|
|
51
|
+
## Step 1: Scout
|
|
52
|
+
|
|
53
|
+
Understand the affected code before forming hypotheses.
|
|
54
|
+
|
|
55
|
+
**Action:** Activate `hapo:inspect` for the relevant scope.
|
|
56
|
+
|
|
57
|
+
**Checklist:**
|
|
58
|
+
- [ ] Affected files and modules identified
|
|
59
|
+
- [ ] Direct dependencies and call paths mapped
|
|
60
|
+
- [ ] Related tests located
|
|
61
|
+
- [ ] Recent changes checked: `git log --oneline -10 -- <affected-files>`
|
|
62
|
+
- [ ] Existing working examples or adjacent patterns identified
|
|
63
|
+
|
|
64
|
+
**Output:** `✓ Step 1: Scouted - [N] files, [M] deps, [K] tests`
|
|
65
|
+
|
|
66
|
+
---
|
|
67
|
+
|
|
68
|
+
## Step 2: Capture Evidence
|
|
69
|
+
|
|
70
|
+
Create a baseline that can later prove whether the issue changed.
|
|
71
|
+
|
|
72
|
+
**Capture:**
|
|
73
|
+
- Exact command, URL, user flow, or trigger
|
|
74
|
+
- Exact error message, stack trace, failing assertion, or visual symptom
|
|
75
|
+
- Expected vs actual behavior
|
|
76
|
+
- Relevant logs with timestamps
|
|
77
|
+
- Environment facts: runtime, dependency versions, OS, browser, CI runner, config
|
|
78
|
+
- Whether the issue reproduces consistently or intermittently
|
|
79
|
+
|
|
80
|
+
For frontend issues, use `references/debugger/frontend-verification.md`.
|
|
81
|
+
For CI/log issues, use `references/debugger/log-ci-analysis.md`.
|
|
82
|
+
For performance issues, use `references/debugger/performance-diagnostics.md`.
|
|
83
|
+
|
|
84
|
+
**Output:** `✓ Step 2: Evidence captured - baseline command/symptom recorded`
|
|
85
|
+
|
|
86
|
+
---
|
|
87
|
+
|
|
88
|
+
## Step 3: Pattern Analysis
|
|
89
|
+
|
|
90
|
+
Before proposing a cause, compare against known-good patterns.
|
|
91
|
+
|
|
92
|
+
**Check:**
|
|
93
|
+
- Similar implementation that works
|
|
94
|
+
- Similar tests that pass
|
|
95
|
+
- Recent code that changed the same contract
|
|
96
|
+
- Config/env differences between passing and failing contexts
|
|
97
|
+
- Dependency/API contract changes
|
|
98
|
+
|
|
99
|
+
**Output:** `✓ Step 3: Patterns compared - [working reference] vs [failing path]`
|
|
100
|
+
|
|
101
|
+
---
|
|
102
|
+
|
|
103
|
+
## Step 4: Hypothesis Tests
|
|
104
|
+
|
|
105
|
+
Create 2-3 competing hypotheses. Test one variable at a time.
|
|
106
|
+
|
|
107
|
+
```text
|
|
108
|
+
Hypothesis: [statement]
|
|
109
|
+
Confirm if: [evidence that proves it]
|
|
110
|
+
Refute if: [evidence that disproves it]
|
|
111
|
+
Quick test: [command/search/log/query]
|
|
112
|
+
Result: confirmed | refuted | inconclusive
|
|
113
|
+
```
|
|
114
|
+
|
|
115
|
+
Rules:
|
|
116
|
+
- Never batch unrelated changes as a test.
|
|
117
|
+
- Prefer read-only evidence: logs, grep, stack traces, DB queries, browser traces.
|
|
118
|
+
- For flaky async tests, use `references/debugger/condition-based-waiting.md`.
|
|
119
|
+
- If 2+ hypotheses are refuted, use inversion: ask what evidence would make the current explanation impossible.
|
|
120
|
+
|
|
121
|
+
**Output:** `✓ Step 4: Hypotheses tested - [confirmed/refuted counts]`
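The Step 4 hypothesis template and the "2+ refuted" rule can be sketched as a small record type plus a tally. The class and function names here are hypothetical and only mirror the template fields; the skill itself prescribes no data structure.

```python
from collections import Counter
from dataclasses import dataclass

RESULTS = ("confirmed", "refuted", "inconclusive")


@dataclass
class Hypothesis:
    """One entry of the Step 4 template (field names are illustrative)."""
    statement: str
    confirm_if: str
    refute_if: str
    quick_test: str
    result: str = "inconclusive"

    def __post_init__(self) -> None:
        if self.result not in RESULTS:
            raise ValueError(f"unknown result: {self.result!r}")


def summarize(hypotheses: list[Hypothesis]) -> dict:
    """Count results and flag when the strategy should change."""
    counts = Counter(h.result for h in hypotheses)
    return {
        "confirmed": counts["confirmed"],
        "refuted": counts["refuted"],
        "inconclusive": counts["inconclusive"],
        # Hard gate from the skill: 2+ refuted hypotheses -> change strategy.
        "change_strategy": counts["refuted"] >= 2,
    }
```

Keeping the counts next to the `change_strategy` flag makes the Step 4 output line and the hard-gate check come from the same data.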
|
|
122
|
+
|
|
123
|
+
---
|
|
124
|
+
|
|
125
|
+
## Step 5: Root Cause Trace
|
|
126
|
+
|
|
127
|
+
Trace backward from symptom to origin.
|
|
128
|
+
|
|
129
|
+
```text
|
|
130
|
+
Symptom
|
|
131
|
+
<- immediate cause
|
|
132
|
+
<- contributing factor
|
|
133
|
+
<- ROOT CAUSE
|
|
134
|
+
```
|
|
135
|
+
|
|
136
|
+
**Exact root-cause contract:**
|
|
137
|
+
- Symptom: exact observable failure
|
|
138
|
+
- Reproduction: command/user flow/log trigger
|
|
139
|
+
- Expected vs actual behavior
|
|
140
|
+
- Root cause: file:line or config/env source
|
|
141
|
+
- Why now: recent change, data state, dependency, environment, timing, or load factor
|
|
142
|
+
- Evidence chain: observations that prove this cause
|
|
143
|
+
- Blast radius: files/modules/tests/users/workflows affected
|
|
144
|
+
|
|
145
|
+
**Output:** `✓ Step 5: Root cause traced - [file:line/config/env]`
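One way to make the "contract is complete" condition checkable is to treat the root-cause contract as a record whose fields must all be filled. This is a minimal sketch; `RootCauseContract` and its methods are hypothetical names, with "Expected vs actual" split into two fields as the report format does.

```python
from dataclasses import dataclass, fields


@dataclass
class RootCauseContract:
    """Fields of the exact root-cause contract (names are illustrative)."""
    symptom: str = ""
    reproduction: str = ""
    expected: str = ""
    actual: str = ""
    root_cause: str = ""  # file:line or config/env source
    why_now: str = ""
    evidence_chain: tuple[str, ...] = ()
    blast_radius: tuple[str, ...] = ()

    def missing_fields(self) -> list[str]:
        """Names of contract fields that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def is_complete(self) -> bool:
        """True only when every field has been filled in."""
        return not self.missing_fields()
```

Under this sketch, a report with any empty field would correspond to `Root cause: unknown` plus the list of missing evidence.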
|
|
146
|
+
|
|
147
|
+
---
|
|
148
|
+
|
|
149
|
+
## Step 6: Blast Radius + Verification Plan
|
|
150
|
+
|
|
151
|
+
Prepare the handoff to `hapo:hotfix` or the user.
|
|
152
|
+
|
|
153
|
+
**Verification plan must include:**
|
|
154
|
+
- Original failing command or reproduction path
|
|
155
|
+
- Targeted regression test or scenario
|
|
156
|
+
- Affected-module tests
|
|
157
|
+
- Typecheck/lint/build commands when relevant
|
|
158
|
+
- UI screenshot/console/network checks when relevant
|
|
159
|
+
- Side-effect sweep from `references/debugger/side-effect-gate.md`
|
|
160
|
+
|
|
161
|
+
**Output:** `✓ Step 6: Verification planned - [commands/scenarios]`
|
|
162
|
+
|
|
163
|
+
---
|
|
164
|
+
|
|
165
|
+
## Diagnostic Report Format
|
|
166
|
+
|
|
167
|
+
```markdown
|
|
168
|
+
## Debug Report
|
|
169
|
+
|
|
170
|
+
**Issue:** [one-line summary]
|
|
171
|
+
**Mode:** quick | standard | ci | frontend | perf
|
|
172
|
+
**Root cause confidence:** high | medium | low | unknown
|
|
173
|
+
|
|
174
|
+
### Root Cause Contract
|
|
175
|
+
- Symptom:
|
|
176
|
+
- Reproduction:
|
|
177
|
+
- Expected:
|
|
178
|
+
- Actual:
|
|
179
|
+
- Root cause:
|
|
180
|
+
- Why now:
|
|
181
|
+
- Evidence chain:
|
|
182
|
+
- Blast radius:
|
|
183
|
+
|
|
184
|
+
### Hypotheses Tested
|
|
185
|
+
1. [confirmed/refuted/inconclusive] [hypothesis] - [evidence]
|
|
186
|
+
|
|
187
|
+
### Verification Plan
|
|
188
|
+
- Original reproduction:
|
|
189
|
+
- Regression guard:
|
|
190
|
+
- Side-effect sweep:
|
|
191
|
+
|
|
192
|
+
### Recommended Fix Direction
|
|
193
|
+
[Smallest root-cause fix, or "insufficient evidence"]
|
|
194
|
+
|
|
195
|
+
### Unresolved Questions
|
|
196
|
+
- [Only if any]
|
|
197
|
+
```
|
|
198
|
+
|
|
199
|
+
## Relationship To Hotfix
|
|
200
|
+
|
|
201
|
+
- Use `hapo:debug` to determine what is wrong.
|
|
202
|
+
- Use `hapo:hotfix` to change code after the root-cause contract is complete.
|
|
203
|
+
- If `hapo:hotfix` verification fails, return to `hapo:debug` with the new evidence.
|
|
204
|
+
|
|
205
|
+
## References
|
|
206
|
+
|
|
207
|
+
Load as needed:
|
|
208
|
+
- `references/debugger/core-philosophy.md` - Anti-guessing discipline
|
|
209
|
+
- `references/debugger/root-cause-tracing.md` - Backward trace to origin
|
|
210
|
+
- `references/debugger/verification-protocol.md` - Fresh evidence requirements
|
|
211
|
+
- `references/debugger/log-ci-analysis.md` - Logs and CI/CD failure analysis
|
|
212
|
+
- `references/debugger/parallel-agent-hydration.md` - Parallel reconnaissance
|
|
213
|
+
- `references/debugger/frontend-verification.md` - Browser/UI verification
|
|
214
|
+
- `references/debugger/performance-diagnostics.md` - Performance investigation
|
|
215
|
+
- `references/debugger/condition-based-waiting.md` - Flaky async test diagnosis
|
|
216
|
+
- `references/debugger/side-effect-gate.md` - Regression and blast-radius checks
|
|
@@ -94,7 +94,7 @@ flowchart TD
|
|
|
94
94
|
- Before coding, set the active task(s) to `in_progress` in both markdown and `spec.json.task_registry`, or route through `/hapo:sync` if the runtime expects the sync protocol.
|
|
95
95
|
|
|
96
96
|
### Step 2: Scout (Codebase Inspection)
|
|
97
|
-
- **Mandatory:** Call agent `
|
|
97
|
+
- **Mandatory:** Call agent `Agent(subagent_type="inspector", ...)` to scan the overall codebase structure (e.g., where components live, where utils are). Avoid wandering into forbidden zones. Use the legacy `Task` tool only in runtimes that have not renamed the subagent tool yet.
|
|
98
98
|
|
|
99
99
|
### Step 3: Implement Code
|
|
100
100
|
- Act as `god-developer` OR directly write code, executing tasks specified in the loaded Markdown file(s) sequentially.
|
|
@@ -35,11 +35,11 @@ START_LOOP:
|
|
|
35
35
|
---------------------------------------------------------------
|
|
36
36
|
PARALLEL GATE: Spawn BOTH agents simultaneously
|
|
37
37
|
---------------------------------------------------------------
|
|
38
|
-
→
|
|
38
|
+
→ Agent(subagent_type="test-runner",
|
|
39
39
|
prompt="Run task-aware verification for the recently implemented code. Read the active task file(s) and execute the exact verification commands named there first, in order. Preflight compile/typecheck/build failures must be reported as PRECHECK_FAIL and take precedence over NO_TESTS. After that, run any additional repo-level typecheck/test/build checks needed for confidence. Inspect named artifacts/runtime outputs. For multi-service tasks, verify the flow does not rely on process-local stand-ins masquerading as shared state. Return PASS only if automated checks and task evidence both pass. Mark anything unexecuted as UNVERIFIED. Treat NO_TESTS as non-passing unless the task did not require a dedicated test suite.",
|
|
40
40
|
description="Test [feature]")
|
|
41
41
|
|
|
42
|
-
→
|
|
42
|
+
→ Agent(subagent_type="code-auditor",
|
|
43
43
|
prompt="Review all recently written code against the active task file(s), referenced requirements, and design contracts. Missing deliverables, placeholder-only wiring, missing runtime entrypoints, overscope edits outside the task packet, silent replacement of named technologies/contracts, or fake cross-service proof via process-local state are Critical even if build/tests pass. Check security, logic, architecture, YAGNI/KISS/DRY. Return score (X/10), critical count, warning list, and evidence gaps.",
|
|
44
44
|
description="Review [feature]")
|
|
45
45
|
|
|
@@ -73,7 +73,7 @@ REVIEW_ONLY:
|
|
|
73
73
|
---------------------------------------------------------------
|
|
74
74
|
Re-run ONLY code-auditor (tests already passed and no new evidence-producing code changed)
|
|
75
75
|
---------------------------------------------------------------
|
|
76
|
-
→
|
|
76
|
+
→ Agent(subagent_type="code-auditor", ...)
|
|
77
77
|
|
|
78
78
|
IF Score >= 9.5 AND Critical = 0 → PASS!
|
|
79
79
|
IF Score < 9.5 OR Critical > 0:
|
|
@@ -2,35 +2,35 @@
|
|
|
2
2
|
|
|
3
3
|
Standard prompt templates for delegating work to subagents within the develop workflow.
|
|
4
4
|
|
|
5
|
-
##
|
|
6
|
-
Use
|
|
5
|
+
## Agent Tool Pattern
|
|
6
|
+
Use the current Claude Code `Agent` tool for subagent invocation. In older runtimes, the same shape may appear as legacy `Task`.
|
|
7
7
|
```
|
|
8
|
-
|
|
8
|
+
Agent(subagent_type="[agent-name]", prompt="[task description]", description="[short title]")
|
|
9
9
|
```
|
|
10
10
|
|
|
11
11
|
## Codebase Inspection Phase
|
|
12
12
|
```
|
|
13
|
-
|
|
13
|
+
Agent(subagent_type="inspector", prompt="Scan and identify all files related to [feature-name] in the current codebase.", description="Scout [feature-name]")
|
|
14
14
|
```
|
|
15
15
|
|
|
16
16
|
## Code Implementation Phase
|
|
17
17
|
```
|
|
18
|
-
|
|
18
|
+
Agent(subagent_type="god-developer", prompt="Implement the sub-tasks from [tasks-directory] based on the specification in [spec.json]", description="Code Feature [feature]")
|
|
19
19
|
```
|
|
20
20
|
|
|
21
21
|
## UI Implementation Phase
|
|
22
22
|
```
|
|
23
|
-
|
|
23
|
+
Agent(subagent_type="ui-ux-designer", prompt="Implement the frontend code for [feature] following ./docs/design-guidelines.md", description="Code UI [feature]")
|
|
24
24
|
```
|
|
25
25
|
|
|
26
26
|
## Code Review Phase
|
|
27
27
|
```
|
|
28
|
-
|
|
28
|
+
Agent(subagent_type="code-auditor", prompt="Review all recently written code. Check for security holes, performance issues, and adherence to YAGNI/KISS/DRY. Return score (X/10), list of critical issues, warnings, and suggestions.", description="Review [phase]")
|
|
29
29
|
```
|
|
30
30
|
|
|
31
31
|
## Test Execution Phase
|
|
32
32
|
```
|
|
33
|
-
|
|
33
|
+
Agent(subagent_type="test-runner",
|
|
34
34
|
prompt="Run tests for recently implemented code. Apply blast-radius scoping
|
|
35
35
|
unless --full is requested. Return structured verdict with Status, Results,
|
|
36
36
|
Coverage, Failures, and Action.",
|
|
@@ -40,7 +40,7 @@ Task(subagent_type="test-runner",
|
|
|
40
40
|
## Parallel Quality Gate (Step 4)
|
|
41
41
|
Spawn both simultaneously — do NOT wait for one before starting the other:
|
|
42
42
|
```
|
|
43
|
-
|
|
44
|
-
|
|
43
|
+
Agent(subagent_type="test-runner", prompt="...", description="Test [feature]")
|
|
44
|
+
Agent(subagent_type="code-auditor", prompt="...", description="Review [feature]")
|
|
45
45
|
```
|
|
46
46
|
Wait for both results → apply quality-gate.md combine logic.
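The combine logic in `quality-gate.md` is not fully visible in this diff, so the following is only a sketch of one plausible reading: the gate passes when the test verdict is acceptable and the audit meets the `Score >= 9.5 AND Critical = 0` threshold quoted in the quality-gate hunk. The type names and the `tests_required` field are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class RunnerVerdict:
    """Result from the test-runner agent (shape is hypothetical)."""
    status: str  # e.g. "PASS", "PRECHECK_FAIL", "NO_TESTS"
    tests_required: bool = True


@dataclass
class AuditVerdict:
    """Result from the code-auditor agent (shape is hypothetical)."""
    score: float
    critical: int


def gate_passes(runner: RunnerVerdict, audit: AuditVerdict) -> bool:
    """Combine both verdicts; both sides must be acceptable."""
    # NO_TESTS counts as passing only when the task needed no test suite.
    tests_ok = runner.status == "PASS" or (
        runner.status == "NO_TESTS" and not runner.tests_required
    )
    # Threshold quoted in the quality-gate hunk.
    audit_ok = audit.score >= 9.5 and audit.critical == 0
    return tests_ok and audit_ok
```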
|
|
@@ -85,7 +85,7 @@ Interpret creatively and make unexpected choices that feel genuinely designed fo
|
|
|
85
85
|
|
|
86
86
|
**Remember:** Claude is capable of extraordinary creative work. Don't hold back, show what can truly be created when thinking outside the box and committing fully to a distinctive vision.
|
|
87
87
|
|
|
88
|
-
**Assets**:
|
|
88
|
+
**Assets**: Analyze supplied visual assets with `hapo:ai-multimodal`; generate or process implementation assets with the project's existing image/CSS/build tooling and document output paths in the task report.
|
|
89
89
|
|
|
90
90
|
## Asset & Analysis References
|
|
91
91
|
|
|
@@ -1,11 +1,11 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: hapo:hotfix
|
|
3
|
-
description: "ALWAYS activate this skill
|
|
4
|
-
argument-hint: "[issue] --quick|--parallel"
|
|
3
|
+
description: "ALWAYS activate this skill when you are asked to FIX a bug, error, test failure, CI/CD issue, type error, lint error, log error, UI issue, or code problem. Uses hapo:debug for evidence-first diagnosis before any code change."
|
|
4
|
+
argument-hint: "[issue] --quick|--parallel|--from-debug"
|
|
5
5
|
version: "1.0.0"
|
|
6
6
|
---
|
|
7
7
|
|
|
8
|
-
# Hotfix
|
|
8
|
+
# Hotfix - Structured Bug Elimination
|
|
9
9
|
|
|
10
10
|
Kill bugs systematically. No guessing. Evidence first, fix second.
|
|
11
11
|
|
|
@@ -13,12 +13,15 @@ Kill bugs systematically. No guessing. Evidence first, fix second.
|
|
|
13
13
|
|
|
14
14
|
- `--quick` - Fast track for trivial issues (lint, type errors, syntax)
|
|
15
15
|
- `--parallel` - Spawn multiple `god-developer` agents for independent issues
|
|
16
|
+
- `--from-debug` - Start from an existing `hapo:debug` report and validate its root-cause contract
|
|
16
17
|
|
|
17
18
|
Default: Autonomous mode — auto-approve when confidence is high.
|
|
18
19
|
|
|
19
20
|
<HARD-GATE>
|
|
20
|
-
Do NOT propose or implement fixes before completing Steps 1-2 (Scout +
|
|
21
|
+
Do NOT propose or implement fixes before completing Steps 1-2 (Scout + `hapo:debug` diagnosis).
|
|
21
22
|
Symptom fixes are FAILURE. Find the root cause first.
|
|
23
|
+
The exact root-cause contract is mandatory: symptom, reproduction, expected/actual, root cause file:line or config/env source, why now, evidence chain, blast radius.
|
|
24
|
+
The side-effect gate is mandatory before completion.
|
|
22
25
|
If 3+ fix attempts fail → STOP. Question the architecture. Discuss with user.
|
|
23
26
|
Exception: `--quick` mode allows abbreviated scout→diagnose→fix for trivial issues.
|
|
24
27
|
</HARD-GATE>
|
|
@@ -38,7 +41,7 @@ Exception: `--quick` mode allows abbreviated scout→diagnose→fix for trivial
|
|
|
38
41
|
```mermaid
|
|
39
42
|
flowchart TD
|
|
40
43
|
A[Issue Input] --> B[Step 1: Scout via hapo:inspect]
|
|
41
|
-
B --> C[Step 2: Diagnose
|
|
44
|
+
B --> C[Step 2: Diagnose via hapo:debug]
|
|
42
45
|
C --> D[Step 3: Classify Complexity]
|
|
43
46
|
D -->|Trivial| E[Quick Fix]
|
|
44
47
|
D -->|Standard| F[Standard Fix]
|
|
@@ -49,7 +52,9 @@ flowchart TD
|
|
|
49
52
|
G --> I
|
|
50
53
|
H --> I
|
|
51
54
|
I --> J[Step 5: Verify + Prevent]
|
|
52
|
-
J
|
|
55
|
+
J --> N[Side-Effect Gate]
|
|
56
|
+
N -->|Pass| K[Step 6: Finalize]
|
|
57
|
+
N -->|Regression Risk| C
|
|
53
58
|
J -->|Fail, <3 attempts| C
|
|
54
59
|
J -->|Fail, 3+ attempts| L[Question Architecture → Discuss with User]
|
|
55
60
|
K --> M[Report + Commit]
|
|
@@ -81,11 +86,11 @@ flowchart TD
|
|
|
81
86
|
|
|
82
87
|
---
|
|
83
88
|
|
|
84
|
-
## Step 2: Diagnose (MANDATORY — never skip)
|
|
89
|
+
## Step 2: Diagnose via `hapo:debug` (MANDATORY — never skip)
|
|
85
90
|
|
|
86
91
|
**Purpose:** Evidence-based root cause analysis. NO guessing.
|
|
87
92
|
|
|
88
|
-
See `references/diagnosis-protocol.md` for full methodology.
|
|
93
|
+
Use `hapo:debug` or validate an existing debug report when `--from-debug` is provided. See `references/diagnosis-protocol.md` for full methodology.
|
|
89
94
|
|
|
90
95
|
**Mandatory chain:**
|
|
91
96
|
1. **Capture pre-fix state:** Record exact error messages, failing test output, stack traces. This is your baseline for Step 5.
|
|
@@ -101,6 +106,15 @@ See `references/diagnosis-protocol.md` for full methodology.
|
|
|
101
106
|
5. **Trace root cause:** Follow the chain backward — symptom → immediate cause → contributing factor → **ROOT CAUSE**.
|
|
102
107
|
6. **Escalate:** If 2+ hypotheses fail, apply Inversion Thinking (see `references/escalation-tactics.md`).
|
|
103
108
|
|
|
109
|
+
**Exact root-cause contract:**
|
|
110
|
+
- Symptom: exact observable failure
|
|
111
|
+
- Reproduction: command, user flow, CI job, log trigger, or route
|
|
112
|
+
- Expected vs actual behavior
|
|
113
|
+
- Root cause: file:line, config, environment, dependency, or data source
|
|
114
|
+
- Why now: recent change, dependency drift, data state, environment, timing, or load
|
|
115
|
+
- Evidence chain: observations proving the cause
|
|
116
|
+
- Blast radius: affected files, modules, tests, users, workflows, or release paths
|
|
117
|
+
|
|
104
118
|
**Output:** `✓ Step 2: Diagnosed — Root cause: [summary], Evidence: [brief], Scope: [N files]`
|
|
105
119
|
|
|
106
120
|
---
|
|
@@ -139,7 +153,7 @@ See `references/diagnosis-protocol.md` for full methodology.
|
|
|
139
153
|
3. Run full test suite
|
|
140
154
|
|
|
141
155
|
### Deep Workflow
|
|
142
|
-
1. **Parallel investigation:**
|
|
156
|
+
1. **Parallel investigation:** After initial scope is known, gather evidence in parallel — Scout (`hapo:inspect`), Diagnose (`hapo:debug`), and Research (`researcher` subagent) can run concurrently. See `references/parallel-patterns.md` Pattern E.
|
|
143
157
|
2. Synthesize findings from all three into a unified fix approach
|
|
144
158
|
3. Plan the fix (consider writing to `references/` for future use)
|
|
145
159
|
4. Implement in stages, verifying each stage
|
|
@@ -164,7 +178,8 @@ See `references/diagnosis-protocol.md` for full methodology.
|
|
|
164
178
|
2. **Regression test:** The test MUST fail without the fix and pass with it.
|
|
165
179
|
3. **Parallel verification:** Run typecheck + lint + build + test simultaneously via `Bash`. See `references/parallel-patterns.md` Pattern C.
|
|
166
180
|
4. **Prevention guard (Standard+ only):** See `references/prevention-gate.md`.
|
|
167
|
-
5. **
|
|
181
|
+
5. **Side-effect gate:** Run the sweep in `references/debugger/side-effect-gate.md`.
|
|
182
|
+
6. **Code review:** Trigger `hapo:code-review` and handle results per `references/review-cycle.md` (autonomous auto-approve loop or HITL depending on fix criticality).
|
|
168
183
|
|
|
169
184
|
**If verification fails:**
|
|
170
185
|
- < 3 attempts → Loop back to Step 2 (re-diagnose with new evidence)
|
|
@@ -201,6 +216,7 @@ Unified step markers (emit after each step):
|
|
|
201
216
|
|
|
202
217
|
| Subagent | When |
|
|
203
218
|
|----------|------|
|
|
219
|
+
| `hapo:debug` | Mandatory diagnosis gate before code edits (Step 2) |
|
|
204
220
|
| `debugger` | Root cause unclear, need deep systematic investigation (Step 2) |
|
|
205
221
|
| `Explore` (parallel) | Scout multiple areas (Step 1), test hypotheses (Step 2) |
|
|
206
222
|
| `Bash` (parallel) | Verify: typecheck + lint + build + test (Step 5) |
|
|
@@ -220,3 +236,7 @@ Load as needed:
|
|
|
220
236
|
- `references/review-cycle.md` — Autonomous approval loop + HITL review handling
|
|
221
237
|
- `references/parallel-patterns.md` — Parallel Explore/Bash/Task coordination with code templates
|
|
222
238
|
- `references/workflow-specialized.md` — CI/CD, test, TypeScript, UI-specific workflows
|
|
239
|
+
- `references/debugger/frontend-verification.md` — Browser/UI evidence and visual verification
|
|
240
|
+
- `references/debugger/performance-diagnostics.md` — Baseline-driven performance diagnosis
|
|
241
|
+
- `references/debugger/condition-based-waiting.md` — Flaky async test diagnosis
|
|
242
|
+
- `references/debugger/side-effect-gate.md` — Regression and blast-radius sweep
|
|
@@ -1,11 +1,13 @@
|
|
|
1
1
|
# Diagnosis Protocol
|
|
2
2
|
|
|
3
|
-
Structured root cause analysis methodology. Replaces ad-hoc guessing with evidence-based investigation.
|
|
3
|
+
Structured root cause analysis methodology. Replaces ad-hoc guessing with evidence-based investigation. Prefer running `hapo:debug` directly; use this file as the hotfix-local checklist.
|
|
4
4
|
|
|
5
5
|
## Core Principle
|
|
6
6
|
|
|
7
7
|
**NEVER guess root causes.** Form hypotheses through structured reasoning and test them against evidence.
|
|
8
8
|
|
|
9
|
+
**NO FIXES IN DIAGNOSIS.** Product-code changes start only after the exact root-cause contract is complete.
|
|
10
|
+
|
|
9
11
|
## Pre-Diagnosis: Capture State (MANDATORY)
|
|
10
12
|
|
|
11
13
|
Before any investigation, capture the current broken state as baseline:
|
|
@@ -58,9 +60,9 @@ Spawn parallel `Explore` subagents to test each hypothesis simultaneously:
|
|
|
58
60
|
|
|
59
61
|
```
|
|
60
62
|
// Launch in SINGLE message — max 3 parallel agents
|
|
61
|
-
|
|
62
|
-
|
|
63
|
-
|
|
63
|
+
Agent(subagent_type="Explore", prompt="Test hypothesis A: [specific search/check]")
|
|
64
|
+
Agent(subagent_type="Explore", prompt="Test hypothesis B: [specific search/check]")
|
|
65
|
+
Agent(subagent_type="Explore", prompt="Test hypothesis C: [specific search/check]")
|
|
64
66
|
```
|
|
65
67
|
|
|
66
68
|
**For each hypothesis result:**
|
|
@@ -81,6 +83,19 @@ Symptom (where error appears)
|
|
|
81
83
|
|
|
82
84
|
**Rule:** NEVER fix where the error appears. Trace back to the source.
|
|
83
85
|
|
|
86
|
+
## Exact Root-Cause Contract (MANDATORY)
|
|
87
|
+
|
|
88
|
+
Before Step 4 implementation in `hapo:hotfix`, record:
|
|
89
|
+
|
|
90
|
+
- Symptom: exact observable failure
|
|
91
|
+
- Reproduction: command, user flow, CI job, log trigger, or route
|
|
92
|
+
- Expected: intended behavior
|
|
93
|
+
- Actual: observed behavior
|
|
94
|
+
- Root cause: file:line, config, environment, dependency, or data source
|
|
95
|
+
- Why now: recent change, data state, dependency drift, environment, timing, or load factor
|
|
96
|
+
- Evidence chain: observations proving this cause
|
|
97
|
+
- Blast radius: affected files, modules, tests, users, workflows, or release paths
|
|
98
|
+
|
|
84
99
|
### Phase 5: Escalate — When hypotheses fail
|
|
85
100
|
|
|
86
101
|
If 2+ hypotheses are REFUTED → see `escalation-tactics.md`.
|
|
@@ -96,6 +111,15 @@ If 2+ hypotheses are REFUTED → see `escalation-tactics.md`.
|
|
|
96
111
|
### Root Cause
|
|
97
112
|
[Clear explanation traced back to origin]
|
|
98
113
|
|
|
114
|
+
### Exact Root-Cause Contract
|
|
115
|
+
- Symptom:
|
|
116
|
+
- Reproduction:
|
|
117
|
+
- Expected:
|
|
118
|
+
- Actual:
|
|
119
|
+
- Root cause:
|
|
120
|
+
- Why now:
|
|
121
|
+
- Blast radius:
|
|
122
|
+
|
|
99
123
|
### Evidence Chain
|
|
100
124
|
1. [Observation] → led to hypothesis [X]
|
|
101
125
|
2. [Test result] → confirmed/refuted [X]
|
|
@@ -18,9 +18,9 @@ When you need to understand multiple areas of the codebase before diagnosing:
|
|
|
18
18
|
|
|
19
19
|
```
|
|
20
20
|
// Spawn in a SINGLE message — agents run concurrently
|
|
21
|
-
|
|
22
|
-
|
|
23
|
-
|
|
21
|
+
Agent(subagent_type="Explore", prompt="Scan src/auth/ for token validation logic and recent changes")
|
|
22
|
+
Agent(subagent_type="Explore", prompt="Scan src/middleware/ for request interceptors that touch headers")
|
|
23
|
+
Agent(subagent_type="Explore", prompt="Find all test files matching *auth*.test.* and check coverage")
|
|
24
24
|
```
|
|
25
25
|
|
|
26
26
|
Wait for all agents to return. Merge their findings into a unified context map before proceeding to diagnosis.
|
|
@@ -30,9 +30,9 @@ Wait for all agents to return. Merge their findings into a unified context map b
|
|
|
30
30
|
After forming 2-3 hypotheses in Step 2 (Diagnose), test them concurrently:
|
|
31
31
|
|
|
32
32
|
```
|
|
33
|
-
|
|
34
|
-
|
|
35
|
-
|
|
33
|
+
Agent(subagent_type="Explore", prompt="Verify hypothesis: cache returns stale data — check TTL config in src/cache/")
|
|
34
|
+
Agent(subagent_type="Explore", prompt="Verify hypothesis: race condition in login flow — trace async calls in src/auth/login.ts")
|
|
35
|
+
Agent(subagent_type="Explore", prompt="Verify hypothesis: env var missing in production — check .env.example vs deployed config")
|
|
36
36
|
```
|
|
37
37
|
|
|
38
38
|
Each agent returns CONFIRMED, REFUTED, or INCONCLUSIVE with evidence.
|
|
@@ -42,10 +42,10 @@ Each agent returns CONFIRMED, REFUTED, or INCONCLUSIVE with evidence.
|
|
|
42
42
|
After implementing a fix, validate from every angle simultaneously:
|
|
43
43
|
|
|
44
44
|
```
|
|
45
|
-
|
|
46
|
-
|
|
47
|
-
|
|
48
|
-
|
|
45
|
+
Bash: npx tsc --noEmit // Typecheck
|
|
46
|
+
Bash: npx eslint src/ --quiet // Lint
|
|
47
|
+
Bash: npm run build // Build
|
|
48
|
+
Bash: npm test -- --bail // Tests
|
|
49
49
|
```
|
|
50
50
|
|
|
51
51
|
All four must pass. If any fails, investigate that specific failure before re-attempting.
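Outside an agent runtime, the same "run all four at once, require all to pass" idea can be approximated with a thread pool. This is a sketch under assumptions: `run_checks` and `PATTERN_C` are hypothetical names, and a zero exit code is taken to mean the check passed.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_checks(commands: dict[str, list[str]]) -> dict[str, bool]:
    """Run every command concurrently; map each check name to pass/fail."""

    def run(argv: list[str]) -> bool:
        try:
            # A zero exit code is treated as a pass.
            return subprocess.run(argv, capture_output=True).returncode == 0
        except OSError:
            # The executable is missing or cannot be started.
            return False

    with ThreadPoolExecutor(max_workers=len(commands) or 1) as pool:
        futures = {name: pool.submit(run, argv) for name, argv in commands.items()}
        return {name: fut.result() for name, fut in futures.items()}


# The four commands shown in the pattern above, as argv lists.
PATTERN_C = {
    "typecheck": ["npx", "tsc", "--noEmit"],
    "lint": ["npx", "eslint", "src/", "--quiet"],
    "build": ["npm", "run", "build"],
    "test": ["npm", "test", "--", "--bail"],
}
```

Calling `run_checks(PATTERN_C)` in a project directory returns one boolean per check, so a failing entry identifies which specific failure to investigate first.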
|
|
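The "all four must pass" rule in this hunk can be sketched as a small concurrent gate. This is a minimal illustration, not code from the package: `Check`, `Runner`, `runGate`, and `failedChecks` are hypothetical names, and the runner is injected so the sketch makes no assumption about how commands are actually executed. The four command strings are the ones shown in the hunk.

```typescript
// Hypothetical helper types; only the command strings come from the diff.
interface Check {
  name: string;
  command: string;
}

interface CheckResult {
  name: string;
  passed: boolean;
}

// A runner resolves when the command succeeds and rejects when it fails.
type Runner = (command: string) => Promise<unknown>;

const CHECKS: Check[] = [
  { name: "typecheck", command: "npx tsc --noEmit" },
  { name: "lint", command: "npx eslint src/ --quiet" },
  { name: "build", command: "npm run build" },
  { name: "tests", command: "npm test -- --bail" },
];

// Start every check at once and wait for all of them, so one failure
// does not hide the outcome of the others.
async function runGate(checks: Check[], run: Runner): Promise<CheckResult[]> {
  const settled = await Promise.allSettled(checks.map((c) => run(c.command)));
  return settled.map((outcome, i) => ({
    name: checks[i].name,
    passed: outcome.status === "fulfilled",
  }));
}

// Names of the checks to investigate before re-attempting.
function failedChecks(results: CheckResult[]): string[] {
  return results.filter((r) => !r.passed).map((r) => r.name);
}
```

In Node.js, one possible runner wraps `child_process.exec` with `util.promisify`, which rejects when the command exits with a non-zero code. The gate passes only when `failedChecks` returns an empty list.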
@@ -79,10 +79,10 @@ In complex bugs (Deep workflow), Steps 1+2+3 should run **concurrently** to save
 
 ```
 // All three launch simultaneously:
-
+Agent(subagent_type="Explore", prompt="Scout: map affected files, dependencies, and test coverage")
 // Meanwhile, main agent starts diagnosis using available error context
 // Meanwhile:
-
+Agent(subagent_type="researcher", prompt="Research: find latest docs and known issues for [library/framework]")
 ```
 
 The main agent begins hypothesis formation (Step 2) immediately using the error message and stack trace. Scout results enrich the diagnosis when they arrive. Research results inform the fix approach.
@@ -98,6 +98,6 @@ The main agent begins hypothesis formation (Step 2) immediately using the error
 
 ## Fallback: When Task Tools Are Unavailable
 
-`TaskCreate`/`TaskUpdate` are
+`TaskCreate`/`TaskUpdate` are task-list tools and can be unavailable in some runtimes. If they fail:
 - Track progress manually using markdown checklists
 - The fix workflow itself remains fully functional — Tasks add visibility, not core logic
@@ -1,6 +1,6 @@
 # Prevention Gate
 
-After a fix is verified, apply defense-in-depth to prevent the same bug class from recurring.
+After a fix is verified, apply defense-in-depth to prevent the same bug class from recurring. Prevention is not the same as side-effect safety; run `references/debugger/side-effect-gate.md` before claiming completion.
 
 ## Mandatory Prevention Checklist
 
@@ -29,6 +29,13 @@ If the bug could recur silently:
 - Include structured context (IDs, timestamps, relevant state)
 - Ensure the log level is appropriate (warn for recoverable, error for critical)
 
+### 5. Layered Defense Guard
+If the same invalid state could enter through multiple paths:
+- Validate at the entry point where data first crosses a trust boundary
+- Validate in business logic where invariants must hold
+- Add environment/config guards when deployment conditions matter
+- Add diagnostic logging only at decision points that help future debugging
+
 ## Quick Mode Prevention
 
 For trivial fixes (lint, type errors), prevention is optional but encouraged:
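The four layers listed under "Layered Defense Guard" in this hunk can be illustrated with a small TypeScript sketch. Everything here is hypothetical: the refund domain, the `REFUNDS_ENABLED` flag, and all function names are assumptions chosen for illustration, not part of the package.

```typescript
// Hypothetical domain used only to illustrate the four layers.
interface RefundRequest {
  orderId: string;
  amountCents: number;
}

// Layer 1: entry point, where untrusted data first crosses the trust boundary.
function parseRefundRequest(raw: unknown): RefundRequest {
  if (typeof raw !== "object" || raw === null) {
    throw new TypeError("refund request must be an object");
  }
  const { orderId, amountCents } = raw as Record<string, unknown>;
  if (typeof orderId !== "string" || orderId.length === 0) {
    throw new TypeError("orderId must be a non-empty string");
  }
  if (typeof amountCents !== "number" || !Number.isInteger(amountCents)) {
    throw new TypeError("amountCents must be an integer");
  }
  return { orderId, amountCents };
}

// Layer 3: environment/config guard (the flag name is an assumption).
function assertRefundsEnabled(env: Record<string, string | undefined>): void {
  if (env.REFUNDS_ENABLED !== "true") {
    throw new Error("refunds are disabled in this environment");
  }
}

// Layer 2: business-logic invariant, re-checked even for parsed input,
// because other code paths may construct a RefundRequest directly.
function applyRefund(paidCents: number, req: RefundRequest): number {
  if (req.amountCents <= 0 || req.amountCents > paidCents) {
    // Layer 4: diagnostic logging only at the decision point, with context.
    console.warn("refund rejected", {
      orderId: req.orderId,
      amountCents: req.amountCents,
      paidCents,
    });
    throw new RangeError(`refund out of range for order ${req.orderId}`);
  }
  return paidCents - req.amountCents;
}
```

The point of the layering is that each guard catches a path the others cannot see: the parser covers external input, the invariant covers internal callers, and the config guard covers deployment differences.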
@@ -27,6 +27,7 @@ When unit/integration/e2e tests are failing:
 2. **Check for pollution:** Does it pass alone but fail in suite? → Test order dependency or shared state leaking
 3. **Check for staleness:** Did the implementation change but the test assertions were not updated?
 4. **Snapshot drift:** If snapshot tests fail, review the diff carefully — is it an intentional change or a regression?
+5. **Flaky async:** Replace arbitrary sleeps with condition-based waits. See `references/debugger/condition-based-waiting.md`.
 
 ---
 
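The "Flaky async" item added in this hunk contrasts fixed sleeps with condition-based waits. The contents of `condition-based-waiting.md` are not shown in this section, so the following TypeScript sketch is a generic illustration of the idea rather than the package's helper: `waitFor`, `WaitOptions`, and the default timeout and polling interval are all assumptions.

```typescript
// Hypothetical options; the defaults are arbitrary illustrative values.
interface WaitOptions {
  timeoutMs?: number;
  intervalMs?: number;
}

// Poll a condition until it holds instead of sleeping for a fixed time.
// Resolves as soon as the condition is true; rejects once the deadline passes.
async function waitFor(
  condition: () => boolean | Promise<boolean>,
  { timeoutMs = 2000, intervalMs = 20 }: WaitOptions = {},
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    if (await condition()) {
      return;
    }
    if (Date.now() >= deadline) {
      throw new Error(`condition not met within ${timeoutMs} ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

A test that previously slept for a fixed second before asserting could instead call something like `await waitFor(() => queue.length === 0)`, where `queue` stands for whatever state the test is waiting on. The test then finishes as soon as the state is reached and fails with a clear timeout message if it never is.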
@@ -51,7 +52,8 @@ When the interface is broken, misaligned, or not rendering:
 1. **Capture visual evidence:** Use `pushd skills/chrome-devtools/scripts && node screenshot.js --url <url> && popd`
 2. **Check ARIA structure:** `pushd skills/chrome-devtools/scripts && node aria-snapshot.js --url <url> && popd` — reveals hidden overlays, z-index battles
 3. **Check console errors:** `pushd skills/chrome-devtools/scripts && node console.js --url <url> && popd` — catches JS crashes preventing render
-4. **
+4. **Run frontend verification:** Follow `references/debugger/frontend-verification.md` for screenshot, console, network, accessibility, responsive, and interaction evidence.
+5. **Common traps:**
 - CSS specificity wars (use browser DevTools or ARIA snapshot to verify computed styles)
 - Hydration mismatch in SSR frameworks (server HTML differs from client render)
 - Missing responsive breakpoints (test at multiple viewport widths)
@@ -37,7 +37,7 @@ Instead of rejecting, use a **2-phase approach**: first run a lightweight Struct
 
 Spawn **one dedicated scout agent** to map the top-level structure before any work is divided:
 
-1. **Discover top-level layout** - Use `Glob`
+1. **Discover top-level layout** - Use `Glob` or `Bash` `ls` to list immediate children of the scope root:
 - Top-level directories (src/, apps/, backend/, frontend/, packages/, etc.)
 - Key config files (README.md, package.json, tsconfig.json, pyproject.toml, go.mod, etc.)
 - Monorepo markers (packages/*, apps/*, lerna.json, pnpm-workspace.yaml, turbo.json)
@@ -113,7 +113,7 @@ Follow-up: "Want to investigate deeper? Choose: backend API | frontend component
 
 ## Configuration
 
-Read from
+Read from `.claude/runtime.json`:
 ```json
 {
 "gemini": {