@haposoft/cafekit 0.8.0 → 0.8.3

Files changed (42)
  1. package/README.md +2 -0
  2. package/package.json +6 -3
  3. package/src/claude/CLAUDE.md +1 -0
  4. package/src/claude/agents/debugger.md +58 -4
  5. package/src/claude/agents/docs-keeper.md +1 -1
  6. package/src/claude/agents/god-developer.md +2 -2
  7. package/src/claude/agents/project-manager.md +1 -1
  8. package/src/claude/agents/spec-maker.md +22 -19
  9. package/src/claude/agents/test-runner.md +1 -0
  10. package/src/claude/agents/ui-ux-designer.md +3 -3
  11. package/src/claude/migration-manifest.json +1 -0
  12. package/src/claude/references/debugger/condition-based-waiting.md +56 -0
  13. package/src/claude/references/debugger/frontend-verification.md +59 -0
  14. package/src/claude/references/debugger/performance-diagnostics.md +76 -0
  15. package/src/claude/references/debugger/side-effect-gate.md +48 -0
  16. package/src/claude/rules/manage-docs.md +2 -2
  17. package/src/claude/settings/settings.json +1 -1
  18. package/src/claude/skills/ai-multimodal/SKILL.md +1 -1
  19. package/src/claude/skills/brainstorm/SKILL.md +2 -2
  20. package/src/claude/skills/chrome-devtools/SKILL.md +1 -1
  21. package/src/claude/skills/code-review/SKILL.md +1 -1
  22. package/src/claude/skills/debug/SKILL.md +216 -0
  23. package/src/claude/skills/develop/SKILL.md +1 -1
  24. package/src/claude/skills/develop/references/quality-gate.md +3 -3
  25. package/src/claude/skills/develop/references/subagent-patterns.md +10 -10
  26. package/src/claude/skills/frontend-design/SKILL.md +1 -1
  27. package/src/claude/skills/hotfix/SKILL.md +30 -10
  28. package/src/claude/skills/hotfix/references/diagnosis-protocol.md +28 -4
  29. package/src/claude/skills/hotfix/references/parallel-patterns.md +13 -13
  30. package/src/claude/skills/hotfix/references/prevention-gate.md +8 -1
  31. package/src/claude/skills/hotfix/references/workflow-specialized.md +3 -1
  32. package/src/claude/skills/inspect/SKILL.md +2 -2
  33. package/src/claude/skills/inspect/references/external-gemini-inspection.md +11 -11
  34. package/src/claude/skills/research/SKILL.md +1 -1
  35. package/src/claude/skills/specs/SKILL.md +29 -16
  36. package/src/claude/skills/specs/references/codebase-analysis.md +34 -3
  37. package/src/claude/skills/specs/references/research-strategy.md +54 -7
  38. package/src/claude/skills/specs/templates/research.md +46 -0
  39. package/src/claude/skills/test/SKILL.md +1 -1
  40. package/src/claude/skills/ai-multimodal/scripts/.coverage +0 -0
  41. package/src/claude/skills/ai-multimodal/scripts/tests/.coverage +0 -0
  42. package/src/claude/skills/pdf/scripts/__pycache__/check_bounding_boxes.cpython-314.pyc +0 -0
@@ -291,7 +291,7 @@ node $SKILL_DIR/packages/spec/src/claude/chrome-devtools/tmp/login-test.js
291
291
 
292
292
  Skills can exist in **project-scope** or **user-scope**. Priority: project-scope > user-scope.
293
293
 
294
- **IMPORTANT:** Invoke "/hapo:project-organization" skill to organize the outputs.
294
+ **IMPORTANT:** Store browser artifacts in a project-local screenshots or reports folder and include exact output paths in the final report.
295
295
 
296
296
  Store screenshots for analysis in `<project>/packages/spec/src/claude/chrome-devtools/screenshots/`:
297
297
 
@@ -51,7 +51,7 @@ Does the code match what was requested?
51
51
 
52
52
  ### Stage 3 — Adversarial Review (Red-Team)
53
53
  Actively try to break the code.
54
- - **Edge Case Scouting:** If the Pull Request modifies >= 5 files, invoke the `inspect` agent to scout the codebase to see where modified functions/components are imported and if boundary errors exist before finishing the review.
54
+ - **Edge Case Scouting:** If the Pull Request modifies >= 5 files, activate `hapo:inspect` or call the `inspector` agent to scout where modified functions/components are imported and whether boundary errors exist before finishing the review.
55
55
  - Find security holes (XSS, SQL Injection, Hardcoded tokens, Exposed Secrets).
56
56
  - Find false assumptions, resource exhaustion loops, and race conditions.
57
57
  - Find unhandled edge cases (e.g. empty strings, null pointers, negative integers).
@@ -0,0 +1,216 @@
1
+ ---
2
+ name: hapo:debug
3
+ description: "Use before fixing any bug, failing test, CI/CD failure, production incident, performance issue, UI regression, flaky test, or unexpected behavior. Diagnostic-only root-cause workflow with evidence, hypotheses, blast-radius mapping, and verification plan."
4
+ argument-hint: "[issue] --quick|--ci|--frontend|--perf"
5
+ version: "1.0.0"
6
+ ---
7
+
8
+ # Debug - Evidence-First Root Cause Analysis
9
+
10
+ Debugging is diagnosis, not repair. Find the source of the failure before changing product code.
11
+
12
+ ## Arguments
13
+
14
+ - `--quick` - Abbreviated path for syntax, lint, type, or single-test failures with obvious local scope
15
+ - `--ci` - Focus on CI/CD logs, runner environment, dependency versions, and pipeline setup
16
+ - `--frontend` - Include browser console, screenshot, accessibility tree, network, and responsive checks
17
+ - `--perf` - Include baseline measurements, bottleneck layer, profiling, and before/after targets
18
+
19
+ Default: systematic diagnosis with no product-code edits.
20
+
21
+ <HARD-GATE>
22
+ Do NOT implement a fix inside `hapo:debug`.
23
+ Do NOT recommend a fix until the root-cause contract is complete.
24
+ Do NOT stop at the first plausible explanation. Test hypotheses against evidence.
25
+ If 2+ hypotheses are refuted, change strategy before continuing.
26
+ If evidence is insufficient, report `Root cause: unknown` with the missing evidence needed.
27
+ </HARD-GATE>
28
+
29
+ Temporary instrumentation is allowed only when it is the minimal way to observe hidden state. Remove it before finishing and report what was instrumented.
30
+
31
+ ## Process Flow
32
+
33
+ ```mermaid
34
+ flowchart TD
35
+ A[Issue Input] --> B[Step 1: Scout via hapo:inspect]
36
+ B --> C[Step 2: Capture Evidence]
37
+ C --> D[Step 3: Pattern Analysis]
38
+ D --> E[Step 4: Hypothesis Tests]
39
+ E --> F[Step 5: Root Cause Trace]
40
+ F --> G[Step 6: Blast Radius + Verification Plan]
41
+ G --> H[Diagnostic Report]
42
+ H --> I{Fix requested?}
43
+ I -->|Yes| J[Hand off to hapo:hotfix]
44
+ I -->|No| K[Stop after diagnosis]
45
+ ```
46
+
47
+ **This diagram is the authoritative workflow.** `hapo:debug` stops at diagnosis unless the user explicitly asks to fix.
48
+
49
+ ---
50
+
51
+ ## Step 1: Scout
52
+
53
+ Understand the affected code before forming hypotheses.
54
+
55
+ **Action:** Activate `hapo:inspect` for the relevant scope.
56
+
57
+ **Checklist:**
58
+ - [ ] Affected files and modules identified
59
+ - [ ] Direct dependencies and call paths mapped
60
+ - [ ] Related tests located
61
+ - [ ] Recent changes checked: `git log --oneline -10 -- <affected-files>`
62
+ - [ ] Existing working examples or adjacent patterns identified
63
+
64
+ **Output:** `✓ Step 1: Scouted - [N] files, [M] deps, [K] tests`
65
+
66
+ ---
67
+
68
+ ## Step 2: Capture Evidence
69
+
70
+ Create a baseline that can later prove whether the issue changed.
71
+
72
+ **Capture:**
73
+ - Exact command, URL, user flow, or trigger
74
+ - Exact error message, stack trace, failing assertion, or visual symptom
75
+ - Expected vs actual behavior
76
+ - Relevant logs with timestamps
77
+ - Environment facts: runtime, dependency versions, OS, browser, CI runner, config
78
+ - Whether the issue reproduces consistently or intermittently
79
+
80
+ For frontend issues, use `references/debugger/frontend-verification.md`.
81
+ For CI/log issues, use `references/debugger/log-ci-analysis.md`.
82
+ For performance issues, use `references/debugger/performance-diagnostics.md`.
83
+
84
+ **Output:** `✓ Step 2: Evidence captured - baseline command/symptom recorded`
85
+
86
+ ---
87
+
88
+ ## Step 3: Pattern Analysis
89
+
90
+ Before proposing a cause, compare against known-good patterns.
91
+
92
+ **Check:**
93
+ - Similar implementation that works
94
+ - Similar tests that pass
95
+ - Recent code that changed the same contract
96
+ - Config/env differences between passing and failing contexts
97
+ - Dependency/API contract changes
98
+
99
+ **Output:** `✓ Step 3: Patterns compared - [working reference] vs [failing path]`
100
+
101
+ ---
102
+
103
+ ## Step 4: Hypothesis Tests
104
+
105
+ Create 2-3 competing hypotheses. Test one variable at a time.
106
+
107
+ ```text
108
+ Hypothesis: [statement]
109
+ Confirm if: [evidence that proves it]
110
+ Refute if: [evidence that disproves it]
111
+ Quick test: [command/search/log/query]
112
+ Result: confirmed | refuted | inconclusive
113
+ ```
114
+
115
+ Rules:
116
+ - Never batch unrelated changes as a test.
117
+ - Prefer read-only evidence: logs, grep, stack traces, DB queries, browser traces.
118
+ - For flaky async tests, use `references/debugger/condition-based-waiting.md`.
119
+ - If 2+ hypotheses are refuted, use inversion: ask what evidence would make the current explanation impossible.
120
+
121
+ **Output:** `✓ Step 4: Hypotheses tested - [confirmed/refuted counts]`
122
+
123
+ ---
124
+
125
+ ## Step 5: Root Cause Trace
126
+
127
+ Trace backward from symptom to origin.
128
+
129
+ ```text
130
+ Symptom
131
+ <- immediate cause
132
+ <- contributing factor
133
+ <- ROOT CAUSE
134
+ ```
135
+
136
+ **Exact root-cause contract:**
137
+ - Symptom: exact observable failure
138
+ - Reproduction: command/user flow/log trigger
139
+ - Expected vs actual behavior
140
+ - Root cause: file:line or config/env source
141
+ - Why now: recent change, data state, dependency, environment, timing, or load factor
142
+ - Evidence chain: observations that prove this cause
143
+ - Blast radius: files/modules/tests/users/workflows affected
144
+
145
+ **Output:** `✓ Step 5: Root cause traced - [file:line/config/env]`
146
+
147
+ ---
148
+
149
+ ## Step 6: Blast Radius + Verification Plan
150
+
151
+ Prepare the handoff to `hapo:hotfix` or the user.
152
+
153
+ **Verification plan must include:**
154
+ - Original failing command or reproduction path
155
+ - Targeted regression test or scenario
156
+ - Affected-module tests
157
+ - Typecheck/lint/build commands when relevant
158
+ - UI screenshot/console/network checks when relevant
159
+ - Side-effect sweep from `references/debugger/side-effect-gate.md`
160
+
161
+ **Output:** `✓ Step 6: Verification planned - [commands/scenarios]`
162
+
163
+ ---
164
+
165
+ ## Diagnostic Report Format
166
+
167
+ ```markdown
168
+ ## Debug Report
169
+
170
+ **Issue:** [one-line summary]
171
+ **Mode:** quick | standard | ci | frontend | perf
172
+ **Root cause confidence:** high | medium | low | unknown
173
+
174
+ ### Root Cause Contract
175
+ - Symptom:
176
+ - Reproduction:
177
+ - Expected:
178
+ - Actual:
179
+ - Root cause:
180
+ - Why now:
181
+ - Evidence chain:
182
+ - Blast radius:
183
+
184
+ ### Hypotheses Tested
185
+ 1. [confirmed/refuted/inconclusive] [hypothesis] - [evidence]
186
+
187
+ ### Verification Plan
188
+ - Original reproduction:
189
+ - Regression guard:
190
+ - Side-effect sweep:
191
+
192
+ ### Recommended Fix Direction
193
+ [Smallest root-cause fix, or "insufficient evidence"]
194
+
195
+ ### Unresolved Questions
196
+ - [Only if any]
197
+ ```
198
+
199
+ ## Relationship To Hotfix
200
+
201
+ - Use `hapo:debug` to determine what is wrong.
202
+ - Use `hapo:hotfix` to change code after the root-cause contract is complete.
203
+ - If `hapo:hotfix` verification fails, return to `hapo:debug` with the new evidence.
204
+
205
+ ## References
206
+
207
+ Load as needed:
208
+ - `references/debugger/core-philosophy.md` - Anti-guessing discipline
209
+ - `references/debugger/root-cause-tracing.md` - Backward trace to origin
210
+ - `references/debugger/verification-protocol.md` - Fresh evidence requirements
211
+ - `references/debugger/log-ci-analysis.md` - Logs and CI/CD failure analysis
212
+ - `references/debugger/parallel-agent-hydration.md` - Parallel reconnaissance
213
+ - `references/debugger/frontend-verification.md` - Browser/UI verification
214
+ - `references/debugger/performance-diagnostics.md` - Performance investigation
215
+ - `references/debugger/condition-based-waiting.md` - Flaky async test diagnosis
216
+ - `references/debugger/side-effect-gate.md` - Regression and blast-radius checks
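Note on the added `hapo:debug` skill: its hard gate says not to recommend a fix until the root-cause contract is complete. The sketch below shows one way that completeness check could look. It is illustrative only: `RootCauseContract`, `missing_fields`, and `fix_direction` are hypothetical names that do not ship in the package, and the field list simply mirrors the contract bullets in Step 5.

```python
from dataclasses import dataclass, fields


@dataclass
class RootCauseContract:
    """Hypothetical container mirroring the contract fields listed in Step 5."""

    symptom: str = ""
    reproduction: str = ""
    expected: str = ""
    actual: str = ""
    root_cause: str = ""  # file:line or config/env source
    why_now: str = ""
    evidence_chain: str = ""
    blast_radius: str = ""


def missing_fields(contract: RootCauseContract) -> list[str]:
    """Return the names of contract fields that are still blank."""
    return [f.name for f in fields(contract) if not getattr(contract, f.name).strip()]


def fix_direction(contract: RootCauseContract, proposed: str) -> str:
    """Surface a fix direction only once every contract field is filled in."""
    return proposed if not missing_fields(contract) else "insufficient evidence"
```

A report generator built this way would fall back to the template's "insufficient evidence" wording whenever any field is blank, which matches the `Root cause: unknown` path described in the hard gate.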
@@ -94,7 +94,7 @@ flowchart TD
94
94
  - Before coding, set the active task(s) to `in_progress` in both markdown and `spec.json.task_registry`, or route through `/hapo:sync` if the runtime expects the sync protocol.
95
95
 
96
96
  ### Step 2: Scout (Codebase Inspection)
97
- - **Mandatory:** Call agent `Task(subagent_type="inspector", ...)` to scan the overall codebase structure (e.g., where components live, where utils are). Avoid wandering into forbidden zones.
97
+ - **Mandatory:** Call agent `Agent(subagent_type="inspector", ...)` to scan the overall codebase structure (e.g., where components live, where utils are). Avoid wandering into forbidden zones. Use the legacy `Task` tool only in runtimes that have not renamed the subagent tool yet.
98
98
 
99
99
  ### Step 3: Implement Code
100
100
  - Act as `god-developer` OR directly write code, executing tasks specified in the loaded Markdown file(s) sequentially.
@@ -35,11 +35,11 @@ START_LOOP:
35
35
  ---------------------------------------------------------------
36
36
  PARALLEL GATE: Spawn BOTH agents simultaneously
37
37
  ---------------------------------------------------------------
38
- Task(subagent_type="test-runner",
38
+ Agent(subagent_type="test-runner",
39
39
  prompt="Run task-aware verification for the recently implemented code. Read the active task file(s) and execute the exact verification commands named there first, in order. Preflight compile/typecheck/build failures must be reported as PRECHECK_FAIL and take precedence over NO_TESTS. After that, run any additional repo-level typecheck/test/build checks needed for confidence. Inspect named artifacts/runtime outputs. For multi-service tasks, verify the flow does not rely on process-local stand-ins masquerading as shared state. Return PASS only if automated checks and task evidence both pass. Mark anything unexecuted as UNVERIFIED. Treat NO_TESTS as non-passing unless the task did not require a dedicated test suite.",
40
40
  description="Test [feature]")
41
41
 
42
- Task(subagent_type="code-auditor",
42
+ Agent(subagent_type="code-auditor",
43
43
  prompt="Review all recently written code against the active task file(s), referenced requirements, and design contracts. Missing deliverables, placeholder-only wiring, missing runtime entrypoints, overscope edits outside the task packet, silent replacement of named technologies/contracts, or fake cross-service proof via process-local state are Critical even if build/tests pass. Check security, logic, architecture, YAGNI/KISS/DRY. Return score (X/10), critical count, warning list, and evidence gaps.",
44
44
  description="Review [feature]")
45
45
 
@@ -73,7 +73,7 @@ REVIEW_ONLY:
73
73
  ---------------------------------------------------------------
74
74
  Re-run ONLY code-auditor (tests already passed and no new evidence-producing code changed)
75
75
  ---------------------------------------------------------------
76
- Task(subagent_type="code-auditor", ...)
76
+ Agent(subagent_type="code-auditor", ...)
77
77
 
78
78
  IF Score >= 9.5 AND Critical = 0 → PASS!
79
79
  IF Score < 9.5 OR Critical > 0:
@@ -2,35 +2,35 @@
2
2
 
3
3
  Standard prompt templates for delegating work to subagents within the develop workflow.
4
4
 
5
- ## Task Tool Pattern
6
- Use when the Task tool is available in the environment:
5
+ ## Agent Tool Pattern
6
+ Use the current Claude Code `Agent` tool for subagent invocation. In older runtimes, the same shape may appear as legacy `Task`.
7
7
  ```
8
- Task(subagent_type="[agent-name]", prompt="[task description]", description="[short title]")
8
+ Agent(subagent_type="[agent-name]", prompt="[task description]", description="[short title]")
9
9
  ```
10
10
 
11
11
  ## Codebase Inspection Phase
12
12
  ```
13
- Task(subagent_type="inspector", prompt="Scan and identify all files related to [feature-name] in the current codebase.", description="Scout [feature-name]")
13
+ Agent(subagent_type="inspector", prompt="Scan and identify all files related to [feature-name] in the current codebase.", description="Scout [feature-name]")
14
14
  ```
15
15
 
16
16
  ## Code Implementation Phase
17
17
  ```
18
- Task(subagent_type="god-developer", prompt="Implement the sub-tasks from [tasks-directory] based on the specification in [spec.json]", description="Code Feature [feature]")
18
+ Agent(subagent_type="god-developer", prompt="Implement the sub-tasks from [tasks-directory] based on the specification in [spec.json]", description="Code Feature [feature]")
19
19
  ```
20
20
 
21
21
  ## UI Implementation Phase
22
22
  ```
23
- Task(subagent_type="ui-ux-designer", prompt="Implement the frontend code for [feature] following ./docs/design-guidelines.md", description="Code UI [feature]")
23
+ Agent(subagent_type="ui-ux-designer", prompt="Implement the frontend code for [feature] following ./docs/design-guidelines.md", description="Code UI [feature]")
24
24
  ```
25
25
 
26
26
  ## Code Review Phase
27
27
  ```
28
- Task(subagent_type="code-auditor", prompt="Review all recently written code. Check for security holes, performance issues, and adherence to YAGNI/KISS/DRY. Return score (X/10), list of critical issues, warnings, and suggestions.", description="Review [phase]")
28
+ Agent(subagent_type="code-auditor", prompt="Review all recently written code. Check for security holes, performance issues, and adherence to YAGNI/KISS/DRY. Return score (X/10), list of critical issues, warnings, and suggestions.", description="Review [phase]")
29
29
  ```
30
30
 
31
31
  ## Test Execution Phase
32
32
  ```
33
- Task(subagent_type="test-runner",
33
+ Agent(subagent_type="test-runner",
34
34
  prompt="Run tests for recently implemented code. Apply blast-radius scoping
35
35
  unless --full is requested. Return structured verdict with Status, Results,
36
36
  Coverage, Failures, and Action.",
@@ -40,7 +40,7 @@ Task(subagent_type="test-runner",
40
40
  ## Parallel Quality Gate (Step 4)
41
41
  Spawn both simultaneously — do NOT wait for one before starting the other:
42
42
  ```
43
- Task(subagent_type="test-runner", prompt="...", description="Test [feature]")
44
- Task(subagent_type="code-auditor", prompt="...", description="Review [feature]")
43
+ Agent(subagent_type="test-runner", prompt="...", description="Test [feature]")
44
+ Agent(subagent_type="code-auditor", prompt="...", description="Review [feature]")
45
45
  ```
46
46
  Wait for both results → apply quality-gate.md combine logic.
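Note on the parallel quality gate: the pattern above spawns both subagents before waiting on either, then combines their verdicts. A minimal Python sketch of that shape follows. The two `run_*` functions are hypothetical stand-ins for the `test-runner` and `code-auditor` subagents, and the combine rule is an assumption based only on the `Score >= 9.5 AND Critical = 0` threshold visible in the quality-gate hunk; the full logic lives in `quality-gate.md`, which this diff does not show in full.

```python
from concurrent.futures import ThreadPoolExecutor


def run_test_runner() -> str:
    """Hypothetical stand-in for the test-runner subagent; returns a status string."""
    return "PASS"


def run_code_auditor() -> dict:
    """Hypothetical stand-in for the code-auditor subagent; returns score and critical count."""
    return {"score": 9.6, "critical": 0}


def gate_passes(test_status: str, score: float, critical: int) -> bool:
    """Assumed combine rule: tests pass AND score >= 9.5 AND no critical findings."""
    return test_status == "PASS" and score >= 9.5 and critical == 0


def parallel_quality_gate(test_fn=run_test_runner, audit_fn=run_code_auditor) -> bool:
    # Submit both jobs before waiting on either, so neither blocks the other.
    with ThreadPoolExecutor(max_workers=2) as pool:
        test_future = pool.submit(test_fn)
        audit_future = pool.submit(audit_fn)
        status = test_future.result()
        review = audit_future.result()
    return gate_passes(status, review["score"], review["critical"])
```

Any status other than `PASS` (for example `NO_TESTS` or `PRECHECK_FAIL`) fails the gate in this sketch, which is stricter than the skill text: the skill allows `NO_TESTS` when the task did not require a dedicated test suite.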
@@ -85,7 +85,7 @@ Interpret creatively and make unexpected choices that feel genuinely designed fo
85
85
 
86
86
  **Remember:** Claude is capable of extraordinary creative work. Don't hold back, show what can truly be created when thinking outside the box and committing fully to a distinctive vision.
87
87
 
88
- **Assets**: Generate images with `hapo:ai-multimodal`, process with `hapo:media-processing`
88
+ **Assets**: Analyze supplied visual assets with `hapo:ai-multimodal`; generate or process implementation assets with the project's existing image/CSS/build tooling and document output paths in the task report.
89
89
 
90
90
  ## Asset & Analysis References
91
91
 
@@ -1,11 +1,11 @@
1
1
  ---
2
2
  name: hapo:hotfix
3
- description: "ALWAYS activate this skill before fixing ANY bug, error, test failure, CI/CD issue, type error, lint error, log error, UI issue, or code problem. Structured 6-step bug-killing workflow with intelligent routing."
4
- argument-hint: "[issue] --quick|--parallel"
3
+ description: "ALWAYS activate this skill when you are asked to FIX a bug, error, test failure, CI/CD issue, type error, lint error, log error, UI issue, or code problem. Uses hapo:debug for evidence-first diagnosis before any code change."
4
+ argument-hint: "[issue] --quick|--parallel|--from-debug"
5
5
  version: "1.0.0"
6
6
  ---
7
7
 
8
- # Hotfix Structured Bug Elimination
8
+ # Hotfix - Structured Bug Elimination
9
9
 
10
10
  Kill bugs systematically. No guessing. Evidence first, fix second.
11
11
 
@@ -13,12 +13,15 @@ Kill bugs systematically. No guessing. Evidence first, fix second.
13
13
 
14
14
  - `--quick` - Fast track for trivial issues (lint, type errors, syntax)
15
15
  - `--parallel` - Spawn multiple `god-developer` agents for independent issues
16
+ - `--from-debug` - Start from an existing `hapo:debug` report and validate its root-cause contract
16
17
 
17
18
  Default: Autonomous mode — auto-approve when confidence is high.
18
19
 
19
20
  <HARD-GATE>
20
- Do NOT propose or implement fixes before completing Steps 1-2 (Scout + Diagnose).
21
+ Do NOT propose or implement fixes before completing Steps 1-2 (Scout + `hapo:debug` diagnosis).
21
22
  Symptom fixes are FAILURE. Find the root cause first.
23
+ The exact root-cause contract is mandatory: symptom, reproduction, expected/actual, root cause file:line or config/env source, why now, evidence chain, blast radius.
24
+ The side-effect gate is mandatory before completion.
22
25
  If 3+ fix attempts fail → STOP. Question the architecture. Discuss with user.
23
26
  Exception: `--quick` mode allows abbreviated scout→diagnose→fix for trivial issues.
24
27
  </HARD-GATE>
@@ -38,7 +41,7 @@ Exception: `--quick` mode allows abbreviated scout→diagnose→fix for trivial
38
41
  ```mermaid
39
42
  flowchart TD
40
43
  A[Issue Input] --> B[Step 1: Scout via hapo:inspect]
41
- B --> C[Step 2: Diagnose - Root Cause Analysis]
44
+ B --> C[Step 2: Diagnose via hapo:debug]
42
45
  C --> D[Step 3: Classify Complexity]
43
46
  D -->|Trivial| E[Quick Fix]
44
47
  D -->|Standard| F[Standard Fix]
@@ -49,7 +52,9 @@ flowchart TD
49
52
  G --> I
50
53
  H --> I
51
54
  I --> J[Step 5: Verify + Prevent]
52
- J -->|Pass| K[Step 6: Finalize]
55
+ J --> N[Side-Effect Gate]
56
+ N -->|Pass| K[Step 6: Finalize]
57
+ N -->|Regression Risk| C
53
58
  J -->|Fail, <3 attempts| C
54
59
  J -->|Fail, 3+ attempts| L[Question Architecture → Discuss with User]
55
60
  K --> M[Report + Commit]
@@ -81,11 +86,11 @@ flowchart TD
81
86
 
82
87
  ---
83
88
 
84
- ## Step 2: Diagnose (MANDATORY — never skip)
89
+ ## Step 2: Diagnose via `hapo:debug` (MANDATORY — never skip)
85
90
 
86
91
  **Purpose:** Evidence-based root cause analysis. NO guessing.
87
92
 
88
- See `references/diagnosis-protocol.md` for full methodology.
93
+ Use `hapo:debug` or validate an existing debug report when `--from-debug` is provided. See `references/diagnosis-protocol.md` for full methodology.
89
94
 
90
95
  **Mandatory chain:**
91
96
  1. **Capture pre-fix state:** Record exact error messages, failing test output, stack traces. This is your baseline for Step 5.
@@ -101,6 +106,15 @@ See `references/diagnosis-protocol.md` for full methodology.
101
106
  5. **Trace root cause:** Follow the chain backward — symptom → immediate cause → contributing factor → **ROOT CAUSE**.
102
107
  6. **Escalate:** If 2+ hypotheses fail, apply Inversion Thinking (see `references/escalation-tactics.md`).
103
108
 
109
+ **Exact root-cause contract:**
110
+ - Symptom: exact observable failure
111
+ - Reproduction: command, user flow, CI job, log trigger, or route
112
+ - Expected vs actual behavior
113
+ - Root cause: file:line, config, environment, dependency, or data source
114
+ - Why now: recent change, dependency drift, data state, environment, timing, or load
115
+ - Evidence chain: observations proving the cause
116
+ - Blast radius: affected files, modules, tests, users, workflows, or release paths
117
+
104
118
  **Output:** `✓ Step 2: Diagnosed — Root cause: [summary], Evidence: [brief], Scope: [N files]`
105
119
 
106
120
  ---
@@ -139,7 +153,7 @@ See `references/diagnosis-protocol.md` for full methodology.
139
153
  3. Run full test suite
140
154
 
141
155
  ### Deep Workflow
142
- 1. **Parallel investigation:** Launch Steps 1+2+3 concurrently — Scout (`hapo:inspect`), Diagnose (from error context), and Research (`researcher` subagent) all run simultaneously. See `references/parallel-patterns.md` Pattern E.
156
+ 1. **Parallel investigation:** After initial scope is known, gather evidence in parallel — Scout (`hapo:inspect`), Diagnose (`hapo:debug`), and Research (`researcher` subagent) can run concurrently. See `references/parallel-patterns.md` Pattern E.
143
157
  2. Synthesize findings from all three into a unified fix approach
144
158
  3. Plan the fix (consider writing to `references/` for future use)
145
159
  4. Implement in stages, verifying each stage
@@ -164,7 +178,8 @@ See `references/diagnosis-protocol.md` for full methodology.
164
178
  2. **Regression test:** The test MUST fail without the fix and pass with it.
165
179
  3. **Parallel verification:** Run typecheck + lint + build + test simultaneously via `Bash`. See `references/parallel-patterns.md` Pattern C.
166
180
  4. **Prevention guard (Standard+ only):** See `references/prevention-gate.md`.
167
- 5. **Code review:** Trigger `hapo:code-review` and handle results per `references/review-cycle.md` (autonomous auto-approve loop or HITL depending on fix criticality).
181
+ 5. **Side-effect gate:** Run the sweep in `references/debugger/side-effect-gate.md`.
182
+ 6. **Code review:** Trigger `hapo:code-review` and handle results per `references/review-cycle.md` (autonomous auto-approve loop or HITL depending on fix criticality).
168
183
 
169
184
  **If verification fails:**
170
185
  - < 3 attempts → Loop back to Step 2 (re-diagnose with new evidence)
@@ -201,6 +216,7 @@ Unified step markers (emit after each step):
201
216
 
202
217
  | Subagent | When |
203
218
  |----------|------|
219
+ | `hapo:debug` | Mandatory diagnosis gate before code edits (Step 2) |
204
220
  | `debugger` | Root cause unclear, need deep systematic investigation (Step 2) |
205
221
  | `Explore` (parallel) | Scout multiple areas (Step 1), test hypotheses (Step 2) |
206
222
  | `Bash` (parallel) | Verify: typecheck + lint + build + test (Step 5) |
@@ -220,3 +236,7 @@ Load as needed:
220
236
  - `references/review-cycle.md` — Autonomous approval loop + HITL review handling
221
237
  - `references/parallel-patterns.md` — Parallel Explore/Bash/Task coordination with code templates
222
238
  - `references/workflow-specialized.md` — CI/CD, test, TypeScript, UI-specific workflows
239
+ - `references/debugger/frontend-verification.md` — Browser/UI evidence and visual verification
240
+ - `references/debugger/performance-diagnostics.md` — Baseline-driven performance diagnosis
241
+ - `references/debugger/condition-based-waiting.md` — Flaky async test diagnosis
242
+ - `references/debugger/side-effect-gate.md` — Regression and blast-radius sweep
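Note on the updated hotfix flow: the flowchart now routes verification through a side-effect gate before finalizing, and sends both failed verification and regression risk back to diagnosis. The routing can be sketched as a small decision function. `next_step` and its return labels are hypothetical names chosen for illustration, and `failed_attempts` is assumed to count the attempt that just failed.

```python
def next_step(verify_passed: bool, side_effect_clear: bool, failed_attempts: int) -> str:
    """Pick the next hotfix step from the Step 5 outcome (illustrative labels only)."""
    if not verify_passed:
        # Three or more failed attempts: stop and question the architecture.
        if failed_attempts >= 3:
            return "question-architecture"
        # Fewer than three: loop back to Step 2 with the new evidence.
        return "diagnose"
    if not side_effect_clear:
        # Verification passed but the side-effect gate found regression risk.
        return "diagnose"
    return "finalize"
```

Under these assumptions, `next_step(True, True, 0)` yields `"finalize"`, and a passing verification with an unclear side-effect sweep still returns to diagnosis rather than Step 6.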
@@ -1,11 +1,13 @@
1
1
  # Diagnosis Protocol
2
2
 
3
- Structured root cause analysis methodology. Replaces ad-hoc guessing with evidence-based investigation.
3
+ Structured root cause analysis methodology. Replaces ad-hoc guessing with evidence-based investigation. Prefer running `hapo:debug` directly; use this file as the hotfix-local checklist.
4
4
 
5
5
  ## Core Principle
6
6
 
7
7
  **NEVER guess root causes.** Form hypotheses through structured reasoning and test them against evidence.
8
8
 
9
+ **NO FIXES IN DIAGNOSIS.** Product-code changes start only after the exact root-cause contract is complete.
10
+
9
11
  ## Pre-Diagnosis: Capture State (MANDATORY)
10
12
 
11
13
  Before any investigation, capture the current broken state as baseline:
@@ -58,9 +60,9 @@ Spawn parallel `Explore` subagents to test each hypothesis simultaneously:
58
60
 
59
61
  ```
60
62
  // Launch in SINGLE message — max 3 parallel agents
61
- Task("Explore", "Test hypothesis A: [specific search/check]")
62
- Task("Explore", "Test hypothesis B: [specific search/check]")
63
- Task("Explore", "Test hypothesis C: [specific search/check]")
63
+ Agent(subagent_type="Explore", prompt="Test hypothesis A: [specific search/check]")
64
+ Agent(subagent_type="Explore", prompt="Test hypothesis B: [specific search/check]")
65
+ Agent(subagent_type="Explore", prompt="Test hypothesis C: [specific search/check]")
64
66
  ```
65
67
 
66
68
  **For each hypothesis result:**
@@ -81,6 +83,19 @@ Symptom (where error appears)
81
83
 
82
84
  **Rule:** NEVER fix where the error appears. Trace back to the source.
83
85
 
86
+ ## Exact Root-Cause Contract (MANDATORY)
87
+
88
+ Before Step 4 implementation in `hapo:hotfix`, record:
89
+
90
+ - Symptom: exact observable failure
91
+ - Reproduction: command, user flow, CI job, log trigger, or route
92
+ - Expected: intended behavior
93
+ - Actual: observed behavior
94
+ - Root cause: file:line, config, environment, dependency, or data source
95
+ - Why now: recent change, data state, dependency drift, environment, timing, or load factor
96
+ - Evidence chain: observations proving this cause
97
+ - Blast radius: affected files, modules, tests, users, workflows, or release paths
98
+
84
99
  ### Phase 5: Escalate — When hypotheses fail
85
100
 
86
101
  If 2+ hypotheses are REFUTED → see `escalation-tactics.md`.
@@ -96,6 +111,15 @@ If 2+ hypotheses are REFUTED → see `escalation-tactics.md`.
96
111
  ### Root Cause
97
112
  [Clear explanation traced back to origin]
98
113
 
114
+ ### Exact Root-Cause Contract
115
+ - Symptom:
116
+ - Reproduction:
117
+ - Expected:
118
+ - Actual:
119
+ - Root cause:
120
+ - Why now:
121
+ - Blast radius:
122
+
99
123
  ### Evidence Chain
100
124
  1. [Observation] → led to hypothesis [X]
101
125
  2. [Test result] → confirmed/refuted [X]
@@ -18,9 +18,9 @@ When you need to understand multiple areas of the codebase before diagnosing:
18
18
 
19
19
  ```
20
20
  // Spawn in a SINGLE message — agents run concurrently
21
- Task("Explore", "Scan src/auth/ for token validation logic and recent changes")
22
- Task("Explore", "Scan src/middleware/ for request interceptors that touch headers")
23
- Task("Explore", "Find all test files matching *auth*.test.* and check coverage")
21
+ Agent(subagent_type="Explore", prompt="Scan src/auth/ for token validation logic and recent changes")
22
+ Agent(subagent_type="Explore", prompt="Scan src/middleware/ for request interceptors that touch headers")
23
+ Agent(subagent_type="Explore", prompt="Find all test files matching *auth*.test.* and check coverage")
24
24
  ```
25
25
 
26
26
  Wait for all agents to return. Merge their findings into a unified context map before proceeding to diagnosis.
@@ -30,9 +30,9 @@ Wait for all agents to return. Merge their findings into a unified context map b
30
30
  After forming 2-3 hypotheses in Step 2 (Diagnose), test them concurrently:
31
31
 
32
32
  ```
33
- Task("Explore", "Verify hypothesis: cache returns stale data — check TTL config in src/cache/")
34
- Task("Explore", "Verify hypothesis: race condition in login flow — trace async calls in src/auth/login.ts")
35
- Task("Explore", "Verify hypothesis: env var missing in production — check .env.example vs deployed config")
33
+ Agent(subagent_type="Explore", prompt="Verify hypothesis: cache returns stale data — check TTL config in src/cache/")
34
+ Agent(subagent_type="Explore", prompt="Verify hypothesis: race condition in login flow — trace async calls in src/auth/login.ts")
35
+ Agent(subagent_type="Explore", prompt="Verify hypothesis: env var missing in production — check .env.example vs deployed config")
36
36
  ```
37
37
 
38
38
  Each agent returns CONFIRMED, REFUTED, or INCONCLUSIVE with evidence.
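Note on collecting hypothesis verdicts: once the parallel `Explore` agents return, their CONFIRMED / REFUTED / INCONCLUSIVE verdicts need to be tallied, and the diagnosis protocol escalates when two or more hypotheses are refuted. A rough sketch of that tally, assuming verdicts arrive as plain strings keyed by hypothesis text (`summarize` and the result shape are hypothetical):

```python
from collections import Counter

VERDICTS = ("CONFIRMED", "REFUTED", "INCONCLUSIVE")


def summarize(results: dict[str, str]) -> dict:
    """Tally agent verdicts and flag when the escalation rule (2+ refuted) applies."""
    unknown = sorted(set(results.values()) - set(VERDICTS))
    if unknown:
        raise ValueError(f"unexpected verdicts: {unknown}")
    counts = Counter(results.values())
    return {
        "confirmed": [h for h, verdict in results.items() if verdict == "CONFIRMED"],
        "counts": {verdict: counts.get(verdict, 0) for verdict in VERDICTS},
        "escalate": counts["REFUTED"] >= 2,
    }
```

Rejecting unknown verdict strings keeps a malformed agent reply from being silently counted as inconclusive.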
@@ -42,10 +42,10 @@ Each agent returns CONFIRMED, REFUTED, or INCONCLUSIVE with evidence.
42
42
  After implementing a fix, validate from every angle simultaneously:
43
43
 
44
44
  ```
45
- Task("Bash", "npx tsc --noEmit") // Typecheck
46
- Task("Bash", "npx eslint src/ --quiet") // Lint
47
- Task("Bash", "npm run build") // Build
48
- Task("Bash", "npm test -- --bail") // Tests
45
+ Bash: npx tsc --noEmit // Typecheck
46
+ Bash: npx eslint src/ --quiet // Lint
47
+ Bash: npm run build // Build
48
+ Bash: npm test -- --bail // Tests
49
49
  ```
50
50
 
51
51
  All four must pass. If any fails, investigate that specific failure before re-attempting.
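One way to picture the "all four must pass" rule is a small harness that launches every check at once and reports each failure separately. This is a sketch under stated assumptions: the `Runner` type, `CHECKS` list, and `runChecks` function are hypothetical names, and the runner is injected so the example does not depend on any particular process API.

```typescript
// Hypothetical runner: resolves when the command exits with code 0,
// rejects otherwise.
type Runner = (command: string) => Promise<void>;

interface Check {
  name: string;
  command: string;
}

interface CheckResult extends Check {
  passed: boolean;
  error?: string;
}

// The four commands are taken from the snippet above.
const CHECKS: Check[] = [
  { name: "typecheck", command: "npx tsc --noEmit" },
  { name: "lint", command: "npx eslint src/ --quiet" },
  { name: "build", command: "npm run build" },
  { name: "tests", command: "npm test -- --bail" },
];

// Start every check concurrently and wait for all of them, so that one
// failure does not hide the outcome of the others.
async function runChecks(run: Runner, checks: Check[] = CHECKS): Promise<CheckResult[]> {
  return Promise.all(
    checks.map(async (check): Promise<CheckResult> => {
      try {
        await run(check.command);
        return { ...check, passed: true };
      } catch (err) {
        return { ...check, passed: false, error: String(err) };
      }
    }),
  );
}
```

In Node.js the runner could wrap `exec` from `node:child_process`. Any result with `passed: false` identifies the specific failure to investigate before re-attempting.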
@@ -79,10 +79,10 @@ In complex bugs (Deep workflow), Steps 1+2+3 should run **concurrently** to save
79
79
 
80
80
  ```
81
81
  // All three launch simultaneously:
82
- Task("Explore", "Scout: map affected files, dependencies, and test coverage")
82
+ Agent(subagent_type="Explore", prompt="Scout: map affected files, dependencies, and test coverage")
83
83
  // Meanwhile, main agent starts diagnosis using available error context
84
84
  // Meanwhile:
85
- Task("researcher", "Research: find latest docs and known issues for [library/framework]")
85
+ Agent(subagent_type="researcher", prompt="Research: find latest docs and known issues for [library/framework]")
86
86
  ```
87
87
 
88
88
  The main agent begins hypothesis formation (Step 2) immediately using the error message and stack trace. Scout results enrich the diagnosis when they arrive. Research results inform the fix approach.
@@ -98,6 +98,6 @@ The main agent begins hypothesis formation (Step 2) immediately using the error
98
98
 
99
99
  ## Fallback: When Task Tools Are Unavailable
100
100
 
101
- `TaskCreate`/`TaskUpdate` are CLI-only they error in some IDE extensions. If they fail:
101
+ `TaskCreate`/`TaskUpdate` are task-list tools and can be unavailable in some runtimes. If they fail:
102
102
  - Track progress manually using markdown checklists
103
103
  - The fix workflow itself remains fully functional — Tasks add visibility, not core logic
@@ -1,6 +1,6 @@
1
1
  # Prevention Gate
2
2
 
3
- After a fix is verified, apply defense-in-depth to prevent the same bug class from recurring.
3
+ After a fix is verified, apply defense-in-depth to prevent the same bug class from recurring. Prevention is not the same as side-effect safety; run `references/debugger/side-effect-gate.md` before claiming completion.
4
4
 
5
5
  ## Mandatory Prevention Checklist
6
6
 
@@ -29,6 +29,13 @@ If the bug could recur silently:
29
29
  - Include structured context (IDs, timestamps, relevant state)
30
30
  - Ensure the log level is appropriate (warn for recoverable, error for critical)
31
31
 
32
+ ### 5. Layered Defense Guard
33
+ If the same invalid state could enter through multiple paths:
34
+ - Validate at the entry point where data first crosses a trust boundary
35
+ - Validate in business logic where invariants must hold
36
+ - Add environment/config guards when deployment conditions matter
37
+ - Add diagnostic logging only at decision points that help future debugging
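The four layers in this checklist item might look like the following in practice. This is an illustrative sketch only: the transfer domain, the `LEDGER_URL` variable, and every function name are hypothetical and do not come from the package.

```typescript
// All names below are hypothetical; the source lists only the four layers.
interface TransferRequest {
  accountId: string;
  amountCents: number;
}

// Layer 1: entry point, where untrusted data first crosses a trust boundary.
function parseTransfer(input: unknown): TransferRequest {
  if (typeof input !== "object" || input === null) {
    throw new Error("body must be an object");
  }
  const { accountId, amountCents } = input as Record<string, unknown>;
  if (typeof accountId !== "string" || accountId.length === 0) {
    throw new Error("accountId is required");
  }
  if (typeof amountCents !== "number" || !Number.isInteger(amountCents)) {
    throw new Error("amountCents must be an integer");
  }
  return { accountId, amountCents };
}

// Layer 2: business logic re-checks the invariant, because other callers
// (jobs, internal services) may never pass through parseTransfer.
function applyTransfer(
  balanceCents: number,
  req: TransferRequest,
  log: (message: string) => void,
): number {
  if (req.amountCents <= 0 || req.amountCents > balanceCents) {
    // Layer 4: log only at the decision point, with structured context.
    log(`transfer rejected account=${req.accountId} amount=${req.amountCents} balance=${balanceCents}`);
    throw new Error("invalid transfer amount");
  }
  return balanceCents - req.amountCents;
}

// Layer 3: environment/config guard for deployment-specific conditions.
function assertConfig(env: Record<string, string | undefined>): void {
  if (env["NODE_ENV"] === "production" && !env["LEDGER_URL"]) {
    throw new Error("LEDGER_URL is required in production");
  }
}
```

The entry-point and business-logic checks overlap on purpose: either one alone leaves a path through which the invalid state can still enter.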
38
+
32
39
  ## Quick Mode Prevention
33
40
 
34
41
  For trivial fixes (lint, type errors), prevention is optional but encouraged:
@@ -27,6 +27,7 @@ When unit/integration/e2e tests are failing:
27
27
  2. **Check for pollution:** Does it pass alone but fail in suite? → Test order dependency or shared state leaking
28
28
  3. **Check for staleness:** Did the implementation change but the test assertions were not updated?
29
29
  4. **Snapshot drift:** If snapshot tests fail, review the diff carefully — is it an intentional change or a regression?
30
+ 5. **Flaky async:** Replace arbitrary sleeps with condition-based waits. See `references/debugger/condition-based-waiting.md`.
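For item 5, a condition-based wait polls for the state the test actually needs instead of sleeping for a fixed time. The helper below is a generic sketch of that idea, not the content of the referenced file; `waitFor`, `WaitOptions`, and the default timings are assumptions.

```typescript
// Hypothetical helper: poll a condition until it holds or a deadline passes.
interface WaitOptions {
  timeoutMs?: number;  // give up after this long
  intervalMs?: number; // delay between polls
}

async function waitFor(
  condition: () => boolean | Promise<boolean>,
  description: string,
  { timeoutMs = 5000, intervalMs = 25 }: WaitOptions = {},
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    // Check first, so an already-true condition returns without any delay.
    if (await condition()) return;
    if (Date.now() >= deadline) {
      throw new Error(`Timed out after ${timeoutMs}ms waiting for ${description}`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

A test would then replace a fixed sleep with something like `await waitFor(() => queue.length === 0, "queue to drain")`, where `queue` stands for whatever state the test observes. The test finishes as soon as the condition holds and fails with a descriptive message when it never does.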
30
31
 
31
32
  ---
32
33
 
@@ -51,7 +52,8 @@ When the interface is broken, misaligned, or not rendering:
51
52
  1. **Capture visual evidence:** Use `pushd skills/chrome-devtools/scripts && node screenshot.js --url <url> && popd`
52
53
  2. **Check ARIA structure:** `pushd skills/chrome-devtools/scripts && node aria-snapshot.js --url <url> && popd` — reveals hidden overlays, z-index battles
53
54
  3. **Check console errors:** `pushd skills/chrome-devtools/scripts && node console.js --url <url> && popd` — catches JS crashes preventing render
54
- 4. **Common traps:**
55
+ 4. **Run frontend verification:** Follow `references/debugger/frontend-verification.md` for screenshot, console, network, accessibility, responsive, and interaction evidence.
56
+ 5. **Common traps:**
55
57
  - CSS specificity wars (use browser DevTools or ARIA snapshot to verify computed styles)
56
58
  - Hydration mismatch in SSR frameworks (server HTML differs from client render)
57
59
  - Missing responsive breakpoints (test at multiple viewport widths)
@@ -37,7 +37,7 @@ Instead of rejecting, use a **2-phase approach**: first run a lightweight Struct
37
37
 
38
38
  Spawn **one dedicated scout agent** to map the top-level structure before any work is divided:
39
39
 
40
- 1. **Discover top-level layout** - Use `Glob` / `LS` to list immediate children of the scope root:
40
+ 1. **Discover top-level layout** - Use `Glob` or `Bash` `ls` to list immediate children of the scope root:
41
41
  - Top-level directories (src/, apps/, backend/, frontend/, packages/, etc.)
42
42
  - Key config files (README.md, package.json, tsconfig.json, pyproject.toml, go.mod, etc.)
43
43
  - Monorepo markers (packages/*, apps/*, lerna.json, pnpm-workspace.yaml, turbo.json)
@@ -113,7 +113,7 @@ Follow-up: "Want to investigate deeper? Choose: backend API | frontend component
113
113
 
114
114
  ## Configuration
115
115
 
116
- Read from `packages/spec/src/claude/runtime.json`:
116
+ Read from `.claude/runtime.json`:
117
117
  ```json
118
118
  {
119
119
  "gemini": {