npm - @haposoft/cafekit - Versions diffs - 0.8.1 → 0.8.3 - Mend

@haposoft/cafekit 0.8.1 → 0.8.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (40)

package/src/claude/skills/chrome-devtools/SKILL.md CHANGED

@@ -291,7 +291,7 @@ node $SKILL_DIR/packages/spec/src/claude/chrome-devtools/tmp/login-test.js
 Skills can exist in **project-scope** or **user-scope**. Priority: project-scope > user-scope.
-**IMPORTANT:** Invoke "/hapo:project-organization" skill to organize the outputs.
+**IMPORTANT:** Store browser artifacts in a project-local screenshots or reports folder and include exact output paths in the final report.
 Store screenshots for analysis in `<project>/packages/spec/src/claude/chrome-devtools/screenshots/`:

package/src/claude/skills/code-review/SKILL.md CHANGED

@@ -51,7 +51,7 @@ Does the code match what was requested?
 ### Stage 3 — Adversarial Review (Red-Team)
 Actively try to break the code.
-- **Edge Case Scouting:** If the Pull Request modifies >= 5 files, invoke the `inspect` agent to scout the codebase to see where modified functions/components are imported and if boundary errors exist before finishing the review.
+- **Edge Case Scouting:** If the Pull Request modifies >= 5 files, activate `hapo:inspect` or call the `inspector` agent to scout where modified functions/components are imported and whether boundary errors exist before finishing the review.
 - Find security holes (XSS, SQL Injection, Hardcoded tokens, Exposed Secrets).
 - Find false assumptions, resource exhaustion loops, and race conditions.
 - Find unhandled edge cases (e.g. empty strings, null pointers, negative integers).

package/src/claude/skills/debug/SKILL.md ADDED

@@ -0,0 +1,216 @@
+---
+name: hapo:debug
+description: "Use before fixing any bug, failing test, CI/CD failure, production incident, performance issue, UI regression, flaky test, or unexpected behavior. Diagnostic-only root-cause workflow with evidence, hypotheses, blast-radius mapping, and verification plan."
+argument-hint: "[issue] --quick|--ci|--frontend|--perf"
+version: "1.0.0"
+---
+# Debug - Evidence-First Root Cause Analysis
+Debugging is diagnosis, not repair. Find the source of the failure before changing product code.
+## Arguments
+- `--quick` - Abbreviated path for syntax, lint, type, or single-test failures with obvious local scope
+- `--ci` - Focus on CI/CD logs, runner environment, dependency versions, and pipeline setup
+- `--frontend` - Include browser console, screenshot, accessibility tree, network, and responsive checks
+- `--perf` - Include baseline measurements, bottleneck layer, profiling, and before/after targets
+Default: systematic diagnosis with no product-code edits.
+<HARD-GATE>
+Do NOT implement a fix inside `hapo:debug`.
+Do NOT recommend a fix until the root-cause contract is complete.
+Do NOT stop at the first plausible explanation. Test hypotheses against evidence.
+If 2+ hypotheses are refuted, change strategy before continuing.
+If evidence is insufficient, report `Root cause: unknown` with the missing evidence needed.
+</HARD-GATE>
+Temporary instrumentation is allowed only when it is the minimal way to observe hidden state. Remove it before finishing and report what was instrumented.
+## Process Flow
+```mermaid
+flowchart TD
+    A[Issue Input] --> B[Step 1: Scout via hapo:inspect]
+    B --> C[Step 2: Capture Evidence]
+    C --> D[Step 3: Pattern Analysis]
+    D --> E[Step 4: Hypothesis Tests]
+    E --> F[Step 5: Root Cause Trace]
+    F --> G[Step 6: Blast Radius + Verification Plan]
+    G --> H[Diagnostic Report]
+    H --> I{Fix requested?}
+    I -->|Yes| J[Hand off to hapo:hotfix]
+    I -->|No| K[Stop after diagnosis]
+```
+**This diagram is the authoritative workflow.** `hapo:debug` stops at diagnosis unless the user explicitly asks to fix.
+---
+## Step 1: Scout
+Understand the affected code before forming hypotheses.
+**Action:** Activate `hapo:inspect` for the relevant scope.
+**Checklist:**
+- [ ] Affected files and modules identified
+- [ ] Direct dependencies and call paths mapped
+- [ ] Related tests located
+- [ ] Recent changes checked: `git log --oneline -10 -- <affected-files>`
+- [ ] Existing working examples or adjacent patterns identified
+**Output:** `✓ Step 1: Scouted - [N] files, [M] deps, [K] tests`
+---
+## Step 2: Capture Evidence
+Create a baseline that can later prove whether the issue changed.
+**Capture:**
+- Exact command, URL, user flow, or trigger
+- Exact error message, stack trace, failing assertion, or visual symptom
+- Expected vs actual behavior
+- Relevant logs with timestamps
+- Environment facts: runtime, dependency versions, OS, browser, CI runner, config
+- Whether the issue reproduces consistently or intermittently
+For frontend issues, use `references/debugger/frontend-verification.md`.
+For CI/log issues, use `references/debugger/log-ci-analysis.md`.
+For performance issues, use `references/debugger/performance-diagnostics.md`.
+**Output:** `✓ Step 2: Evidence captured - baseline command/symptom recorded`
+---
+## Step 3: Pattern Analysis
+Before proposing a cause, compare against known-good patterns.
+**Check:**
+- Similar implementation that works
+- Similar tests that pass
+- Recent code that changed the same contract
+- Config/env differences between passing and failing contexts
+- Dependency/API contract changes
+**Output:** `✓ Step 3: Patterns compared - [working reference] vs [failing path]`
+---
+## Step 4: Hypothesis Tests
+Create 2-3 competing hypotheses. Test one variable at a time.
+```text
+Hypothesis: [statement]
+Confirm if: [evidence that proves it]
+Refute if: [evidence that disproves it]
+Quick test: [command/search/log/query]
+Result: confirmed | refuted | inconclusive
+```
+Rules:
+- Never batch unrelated changes as a test.
+- Prefer read-only evidence: logs, grep, stack traces, DB queries, browser traces.
+- For flaky async tests, use `references/debugger/condition-based-waiting.md`.
+- If 2+ hypotheses are refuted, use inversion: ask what evidence would make the current explanation impossible.
+**Output:** `✓ Step 4: Hypotheses tested - [confirmed/refuted counts]`
+---
+## Step 5: Root Cause Trace
+Trace backward from symptom to origin.
+```text
+Symptom
+  <- immediate cause
+    <- contributing factor
+      <- ROOT CAUSE
+```
+**Exact root-cause contract:**
+- Symptom: exact observable failure
+- Reproduction: command/user flow/log trigger
+- Expected vs actual behavior
+- Root cause: file:line or config/env source
+- Why now: recent change, data state, dependency, environment, timing, or load factor
+- Evidence chain: observations that prove this cause
+- Blast radius: files/modules/tests/users/workflows affected
+**Output:** `✓ Step 5: Root cause traced - [file:line/config/env]`
+---
+## Step 6: Blast Radius + Verification Plan
+Prepare the handoff to `hapo:hotfix` or the user.
+**Verification plan must include:**
+- Original failing command or reproduction path
+- Targeted regression test or scenario
+- Affected-module tests
+- Typecheck/lint/build commands when relevant
+- UI screenshot/console/network checks when relevant
+- Side-effect sweep from `references/debugger/side-effect-gate.md`
+**Output:** `✓ Step 6: Verification planned - [commands/scenarios]`
+---
+## Diagnostic Report Format
+```markdown
+## Debug Report
+**Issue:** [one-line summary]
+**Mode:** quick | standard | ci | frontend | perf
+**Root cause confidence:** high | medium | low | unknown
+### Root Cause Contract
+- Symptom:
+- Reproduction:
+- Expected:
+- Actual:
+- Root cause:
+- Why now:
+- Evidence chain:
+- Blast radius:
+### Hypotheses Tested
+1. [confirmed/refuted/inconclusive] [hypothesis] - [evidence]
+### Verification Plan
+- Original reproduction:
+- Regression guard:
+- Side-effect sweep:
+### Recommended Fix Direction
+[Smallest root-cause fix, or "insufficient evidence"]
+### Unresolved Questions
+- [Only if any]
+```
+## Relationship To Hotfix
+- Use `hapo:debug` to determine what is wrong.
+- Use `hapo:hotfix` to change code after the root-cause contract is complete.
+- If `hapo:hotfix` verification fails, return to `hapo:debug` with the new evidence.
+## References
+Load as needed:
+- `references/debugger/core-philosophy.md` - Anti-guessing discipline
+- `references/debugger/root-cause-tracing.md` - Backward trace to origin
+- `references/debugger/verification-protocol.md` - Fresh evidence requirements
+- `references/debugger/log-ci-analysis.md` - Logs and CI/CD failure analysis
+- `references/debugger/parallel-agent-hydration.md` - Parallel reconnaissance
+- `references/debugger/frontend-verification.md` - Browser/UI verification
+- `references/debugger/performance-diagnostics.md` - Performance investigation
+- `references/debugger/condition-based-waiting.md` - Flaky async test diagnosis
+- `references/debugger/side-effect-gate.md` - Regression and blast-radius checks
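
The Step 4 hypothesis template and the hard-gate rule "If 2+ hypotheses are refuted, change strategy" can be sketched as a small record type. This is an illustrative Python sketch, not code shipped in the package: `Hypothesis`, `Result`, `should_change_strategy`, and `summarize` are hypothetical names, and the exact wording of the counts in the output marker is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Result(Enum):
    CONFIRMED = "confirmed"
    REFUTED = "refuted"
    INCONCLUSIVE = "inconclusive"


@dataclass
class Hypothesis:
    """One filled-in entry of the Step 4 template (hypothetical structure)."""
    statement: str
    confirm_if: str
    refute_if: str
    quick_test: str
    result: Result = Result.INCONCLUSIVE


def should_change_strategy(hypotheses: list[Hypothesis], threshold: int = 2) -> bool:
    """True once `threshold` or more hypotheses have been refuted."""
    refuted = sum(1 for h in hypotheses if h.result is Result.REFUTED)
    return refuted >= threshold


def summarize(hypotheses: list[Hypothesis]) -> str:
    """Build a Step 4 output marker with confirmed and refuted counts."""
    confirmed = sum(1 for h in hypotheses if h.result is Result.CONFIRMED)
    refuted = sum(1 for h in hypotheses if h.result is Result.REFUTED)
    return f"✓ Step 4: Hypotheses tested - {confirmed} confirmed, {refuted} refuted"
```

Keeping `confirm_if` and `refute_if` on the record makes it harder to mark a hypothesis as confirmed without first stating what evidence would have refuted it.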

package/src/claude/skills/develop/SKILL.md CHANGED

@@ -94,7 +94,7 @@ flowchart TD
 - Before coding, set the active task(s) to `in_progress` in both markdown and `spec.json.task_registry`, or route through `/hapo:sync` if the runtime expects the sync protocol.
 ### Step 2: Scout (Codebase Inspection)
-- **Mandatory:** Call agent `Task(subagent_type="inspector", ...)` to scan the overall codebase structure (e.g., where components live, where utils are). Avoid wandering into forbidden zones.
+- **Mandatory:** Call agent `Agent(subagent_type="inspector", ...)` to scan the overall codebase structure (e.g., where components live, where utils are). Avoid wandering into forbidden zones. Use the legacy `Task` tool only in runtimes that have not renamed the subagent tool yet.
 ### Step 3: Implement Code
 - Act as `god-developer` OR directly write code, executing tasks specified in the loaded Markdown file(s) sequentially.

package/src/claude/skills/develop/references/quality-gate.md CHANGED

@@ -35,11 +35,11 @@ START_LOOP:
   ---------------------------------------------------------------
   PARALLEL GATE: Spawn BOTH agents simultaneously
   ---------------------------------------------------------------
-  → Task(subagent_type="test-runner",
+  → Agent(subagent_type="test-runner",
         prompt="Run task-aware verification for the recently implemented code. Read the active task file(s) and execute the exact verification commands named there first, in order. Preflight compile/typecheck/build failures must be reported as PRECHECK_FAIL and take precedence over NO_TESTS. After that, run any additional repo-level typecheck/test/build checks needed for confidence. Inspect named artifacts/runtime outputs. For multi-service tasks, verify the flow does not rely on process-local stand-ins masquerading as shared state. Return PASS only if automated checks and task evidence both pass. Mark anything unexecuted as UNVERIFIED. Treat NO_TESTS as non-passing unless the task did not require a dedicated test suite.",
         description="Test [feature]")
-  → Task(subagent_type="code-auditor",
+  → Agent(subagent_type="code-auditor",
         prompt="Review all recently written code against the active task file(s), referenced requirements, and design contracts. Missing deliverables, placeholder-only wiring, missing runtime entrypoints, overscope edits outside the task packet, silent replacement of named technologies/contracts, or fake cross-service proof via process-local state are Critical even if build/tests pass. Check security, logic, architecture, YAGNI/KISS/DRY. Return score (X/10), critical count, warning list, and evidence gaps.",
         description="Review [feature]")
@@ -73,7 +73,7 @@ REVIEW_ONLY:
   ---------------------------------------------------------------
   Re-run ONLY code-auditor (tests already passed and no new evidence-producing code changed)
   ---------------------------------------------------------------
-  → Task(subagent_type="code-auditor", ...)
+  → Agent(subagent_type="code-auditor", ...)
   IF Score >= 9.5 AND Critical = 0 → PASS!
   IF Score < 9.5 OR Critical > 0:
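
The pass condition in the REVIEW_ONLY branch above reduces to a single predicate. The sketch below is a hypothetical Python restatement, assuming the auditor score is a number out of 10 and the critical count is an integer; `quality_gate_passes` is not a name from the package, and what happens on failure is not shown in this hunk, so it is not modeled here.

```python
def quality_gate_passes(score: float, critical_count: int, pass_score: float = 9.5) -> bool:
    """Return True only when the score meets the bar and no Critical findings remain."""
    return score >= pass_score and critical_count == 0
```

The two branches in the pseudocode are exact complements, so a single boolean is enough to choose between them.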

package/src/claude/skills/develop/references/subagent-patterns.md CHANGED

@@ -2,35 +2,35 @@
 Standard prompt templates for delegating work to subagents within the develop workflow.
-## Task Tool Pattern
-Use when the Task tool is available in the environment:
+## Agent Tool Pattern
+Use the current Claude Code `Agent` tool for subagent invocation. In older runtimes, the same shape may appear as legacy `Task`.
 ```
-Task(subagent_type="[agent-name]", prompt="[task description]", description="[short title]")
+Agent(subagent_type="[agent-name]", prompt="[task description]", description="[short title]")
 ```
 ## Codebase Inspection Phase
 ```
-Task(subagent_type="inspector", prompt="Scan and identify all files related to [feature-name] in the current codebase.", description="Scout [feature-name]")
+Agent(subagent_type="inspector", prompt="Scan and identify all files related to [feature-name] in the current codebase.", description="Scout [feature-name]")
 ```
 ## Code Implementation Phase
 ```
-Task(subagent_type="god-developer", prompt="Implement the sub-tasks from [tasks-directory] based on the specification in [spec.json]", description="Code Feature [feature]")
+Agent(subagent_type="god-developer", prompt="Implement the sub-tasks from [tasks-directory] based on the specification in [spec.json]", description="Code Feature [feature]")
 ```
 ## UI Implementation Phase
 ```
-Task(subagent_type="ui-ux-designer", prompt="Implement the frontend code for [feature] following ./docs/design-guidelines.md", description="Code UI [feature]")
+Agent(subagent_type="ui-ux-designer", prompt="Implement the frontend code for [feature] following ./docs/design-guidelines.md", description="Code UI [feature]")
 ```
 ## Code Review Phase
 ```
-Task(subagent_type="code-auditor", prompt="Review all recently written code. Check for security holes, performance issues, and adherence to YAGNI/KISS/DRY. Return score (X/10), list of critical issues, warnings, and suggestions.", description="Review [phase]")
+Agent(subagent_type="code-auditor", prompt="Review all recently written code. Check for security holes, performance issues, and adherence to YAGNI/KISS/DRY. Return score (X/10), list of critical issues, warnings, and suggestions.", description="Review [phase]")
 ```
 ## Test Execution Phase
 ```
-Task(subagent_type="test-runner",
+Agent(subagent_type="test-runner",
   prompt="Run tests for recently implemented code. Apply blast-radius scoping
     unless --full is requested. Return structured verdict with Status, Results,
     Coverage, Failures, and Action.",
@@ -40,7 +40,7 @@ Task(subagent_type="test-runner",
 ## Parallel Quality Gate (Step 4)
 Spawn both simultaneously — do NOT wait for one before starting the other:
 ```
-Task(subagent_type="test-runner",  prompt="...", description="Test [feature]")
-Task(subagent_type="code-auditor", prompt="...", description="Review [feature]")
+Agent(subagent_type="test-runner",  prompt="...", description="Test [feature]")
+Agent(subagent_type="code-auditor", prompt="...", description="Review [feature]")
 ```
 Wait for both results → apply quality-gate.md combine logic.
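
The "spawn both, then wait for both" shape of the parallel quality gate can be illustrated with `asyncio.gather`. This is a sketch under stated assumptions: `run_subagent` is a hypothetical stand-in that only simulates a subagent call, and nothing here reflects how the `Agent` tool is actually invoked by the runtime.

```python
import asyncio


async def run_subagent(name: str, prompt: str) -> dict:
    """Hypothetical stand-in for a subagent call; a real runtime would dispatch the agent here."""
    await asyncio.sleep(0.01)  # simulate work without blocking the other agent
    return {"agent": name, "prompt": prompt, "status": "done"}


async def parallel_quality_gate(feature: str) -> list:
    """Start both agents before awaiting either, then wait for both results."""
    return await asyncio.gather(
        run_subagent("test-runner", f"Test {feature}"),
        run_subagent("code-auditor", f"Review {feature}"),
    )
```

`asyncio.gather` returns results in the order the awaitables were passed, so the test-runner verdict is always first regardless of which agent finishes sooner.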

package/src/claude/skills/frontend-design/SKILL.md CHANGED

@@ -85,7 +85,7 @@ Interpret creatively and make unexpected choices that feel genuinely designed fo
 **Remember:** Claude is capable of extraordinary creative work. Don't hold back, show what can truly be created when thinking outside the box and committing fully to a distinctive vision.
-**Assets**: Generate images with `hapo:ai-multimodal`, process with `hapo:media-processing`
+**Assets**: Analyze supplied visual assets with `hapo:ai-multimodal`; generate or process implementation assets with the project's existing image/CSS/build tooling and document output paths in the task report.
 ## Asset & Analysis References

package/src/claude/skills/hotfix/SKILL.md CHANGED

@@ -1,11 +1,11 @@
 ---
 name: hapo:hotfix
-description: "ALWAYS activate this skill before fixing ANY bug, error, test failure, CI/CD issue, type error, lint error, log error, UI issue, or code problem. Structured 6-step bug-killing workflow with intelligent routing."
-argument-hint: "[issue] --quick|--parallel"
+description: "ALWAYS activate this skill when you are asked to FIX a bug, error, test failure, CI/CD issue, type error, lint error, log error, UI issue, or code problem. Uses hapo:debug for evidence-first diagnosis before any code change."
+argument-hint: "[issue] --quick|--parallel|--from-debug"
 version: "1.0.0"
 ---
-# Hotfix — Structured Bug Elimination
+# Hotfix - Structured Bug Elimination
 Kill bugs systematically. No guessing. Evidence first, fix second.
@@ -13,12 +13,15 @@ Kill bugs systematically. No guessing. Evidence first, fix second.
 - `--quick` - Fast track for trivial issues (lint, type errors, syntax)
 - `--parallel` - Spawn multiple `god-developer` agents for independent issues
+- `--from-debug` - Start from an existing `hapo:debug` report and validate its root-cause contract
 Default: Autonomous mode — auto-approve when confidence is high.
 <HARD-GATE>
-Do NOT propose or implement fixes before completing Steps 1-2 (Scout + Diagnose).
+Do NOT propose or implement fixes before completing Steps 1-2 (Scout + `hapo:debug` diagnosis).
 Symptom fixes are FAILURE. Find the root cause first.
+The exact root-cause contract is mandatory: symptom, reproduction, expected/actual, root cause file:line or config/env source, why now, evidence chain, blast radius.
+The side-effect gate is mandatory before completion.
 If 3+ fix attempts fail → STOP. Question the architecture. Discuss with user.
 Exception: `--quick` mode allows abbreviated scout→diagnose→fix for trivial issues.
 </HARD-GATE>
@@ -38,7 +41,7 @@ Exception: `--quick` mode allows abbreviated scout→diagnose→fix for trivial
 ```mermaid
 flowchart TD
     A[Issue Input] --> B[Step 1: Scout via hapo:inspect]
-    B --> C[Step 2: Diagnose - Root Cause Analysis]
+    B --> C[Step 2: Diagnose via hapo:debug]
     C --> D[Step 3: Classify Complexity]
     D -->|Trivial| E[Quick Fix]
     D -->|Standard| F[Standard Fix]
@@ -49,7 +52,9 @@ flowchart TD
     G --> I
     H --> I
     I --> J[Step 5: Verify + Prevent]
-    J -->|Pass| K[Step 6: Finalize]
+    J --> N[Side-Effect Gate]
+    N -->|Pass| K[Step 6: Finalize]
+    N -->|Regression Risk| C
     J -->|Fail, <3 attempts| C
     J -->|Fail, 3+ attempts| L[Question Architecture → Discuss with User]
     K --> M[Report + Commit]
@@ -81,11 +86,11 @@ flowchart TD
 ---
-## Step 2: Diagnose (MANDATORY — never skip)
+## Step 2: Diagnose via `hapo:debug` (MANDATORY — never skip)
 **Purpose:** Evidence-based root cause analysis. NO guessing.
-See `references/diagnosis-protocol.md` for full methodology.
+Use `hapo:debug` or validate an existing debug report when `--from-debug` is provided. See `references/diagnosis-protocol.md` for full methodology.
 **Mandatory chain:**
 1. **Capture pre-fix state:** Record exact error messages, failing test output, stack traces. This is your baseline for Step 5.
@@ -101,6 +106,15 @@ See `references/diagnosis-protocol.md` for full methodology.
 5. **Trace root cause:** Follow the chain backward — symptom → immediate cause → contributing factor → **ROOT CAUSE**.
 6. **Escalate:** If 2+ hypotheses fail, apply Inversion Thinking (see `references/escalation-tactics.md`).
+**Exact root-cause contract:**
+- Symptom: exact observable failure
+- Reproduction: command, user flow, CI job, log trigger, or route
+- Expected vs actual behavior
+- Root cause: file:line, config, environment, dependency, or data source
+- Why now: recent change, dependency drift, data state, environment, timing, or load
+- Evidence chain: observations proving the cause
+- Blast radius: affected files, modules, tests, users, workflows, or release paths
 **Output:** `✓ Step 2: Diagnosed — Root cause: [summary], Evidence: [brief], Scope: [N files]`
 ---
@@ -139,7 +153,7 @@ See `references/diagnosis-protocol.md` for full methodology.
 3. Run full test suite
 ### Deep Workflow
-1. **Parallel investigation:** Launch Steps 1+2+3 concurrently — Scout (`hapo:inspect`), Diagnose (from error context), and Research (`researcher` subagent) all run simultaneously. See `references/parallel-patterns.md` Pattern E.
+1. **Parallel investigation:** After initial scope is known, gather evidence in parallel — Scout (`hapo:inspect`), Diagnose (`hapo:debug`), and Research (`researcher` subagent) can run concurrently. See `references/parallel-patterns.md` Pattern E.
 2. Synthesize findings from all three into a unified fix approach
 3. Plan the fix (consider writing to `references/` for future use)
 4. Implement in stages, verifying each stage
@@ -164,7 +178,8 @@ See `references/diagnosis-protocol.md` for full methodology.
 2. **Regression test:** The test MUST fail without the fix and pass with it.
 3. **Parallel verification:** Run typecheck + lint + build + test simultaneously via `Bash`. See `references/parallel-patterns.md` Pattern C.
 4. **Prevention guard (Standard+ only):** See `references/prevention-gate.md`.
-5. **Code review:** Trigger `hapo:code-review` and handle results per `references/review-cycle.md` (autonomous auto-approve loop or HITL depending on fix criticality).
+5. **Side-effect gate:** Run the sweep in `references/debugger/side-effect-gate.md`.
+6. **Code review:** Trigger `hapo:code-review` and handle results per `references/review-cycle.md` (autonomous auto-approve loop or HITL depending on fix criticality).
 **If verification fails:**
 - < 3 attempts → Loop back to Step 2 (re-diagnose with new evidence)
@@ -201,6 +216,7 @@ Unified step markers (emit after each step):
 | Subagent | When |
 |----------|------|
+| `hapo:debug` | Mandatory diagnosis gate before code edits (Step 2) |
 | `debugger` | Root cause unclear, need deep systematic investigation (Step 2) |
 | `Explore` (parallel) | Scout multiple areas (Step 1), test hypotheses (Step 2) |
 | `Bash` (parallel) | Verify: typecheck + lint + build + test (Step 5) |
@@ -220,3 +236,7 @@ Load as needed:
 - `references/review-cycle.md` — Autonomous approval loop + HITL review handling
 - `references/parallel-patterns.md` — Parallel Explore/Bash/Task coordination with code templates
 - `references/workflow-specialized.md` — CI/CD, test, TypeScript, UI-specific workflows
+- `references/debugger/frontend-verification.md` — Browser/UI evidence and visual verification
+- `references/debugger/performance-diagnostics.md` — Baseline-driven performance diagnosis
+- `references/debugger/condition-based-waiting.md` — Flaky async test diagnosis
+- `references/debugger/side-effect-gate.md` — Regression and blast-radius sweep
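
The updated flowchart routes Step 5 through the new side-effect gate before finalizing. A minimal Python sketch of that routing follows; `NextStep` and `route_after_verification` are hypothetical names, and the interpretation of `attempts` (fix attempts made so far, including the one just verified) is an assumption not stated in the skill.

```python
from enum import Enum


class NextStep(Enum):
    FINALIZE = "Step 6: Finalize"
    REDIAGNOSE = "Step 2: Diagnose via hapo:debug"
    ESCALATE = "Question architecture and discuss with user"


def route_after_verification(verified: bool, side_effect_gate_passed: bool, attempts: int) -> NextStep:
    """Pick the next step from the Step 5 results."""
    if not verified:
        # Fewer than 3 attempts loops back to diagnosis; 3 or more stops the loop.
        return NextStep.REDIAGNOSE if attempts < 3 else NextStep.ESCALATE
    if not side_effect_gate_passed:
        # Regression risk returns to diagnosis even though verification passed.
        return NextStep.REDIAGNOSE
    return NextStep.FINALIZE
```

Under this reading, a fix that verifies but fails the side-effect gate is treated as new evidence for diagnosis rather than as a finished fix.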

package/src/claude/skills/hotfix/references/diagnosis-protocol.md CHANGED

@@ -1,11 +1,13 @@
 # Diagnosis Protocol
-Structured root cause analysis methodology. Replaces ad-hoc guessing with evidence-based investigation.
+Structured root cause analysis methodology. Replaces ad-hoc guessing with evidence-based investigation. Prefer running `hapo:debug` directly; use this file as the hotfix-local checklist.
 ## Core Principle
 **NEVER guess root causes.** Form hypotheses through structured reasoning and test them against evidence.
+**NO FIXES IN DIAGNOSIS.** Product-code changes start only after the exact root-cause contract is complete.
 ## Pre-Diagnosis: Capture State (MANDATORY)
 Before any investigation, capture the current broken state as baseline:
@@ -58,9 +60,9 @@ Spawn parallel `Explore` subagents to test each hypothesis simultaneously:
 ```
 // Launch in SINGLE message — max 3 parallel agents
-Task("Explore", "Test hypothesis A: [specific search/check]")
-Task("Explore", "Test hypothesis B: [specific search/check]")
-Task("Explore", "Test hypothesis C: [specific search/check]")
+Agent(subagent_type="Explore", prompt="Test hypothesis A: [specific search/check]")
+Agent(subagent_type="Explore", prompt="Test hypothesis B: [specific search/check]")
+Agent(subagent_type="Explore", prompt="Test hypothesis C: [specific search/check]")
 ```
 **For each hypothesis result:**
@@ -81,6 +83,19 @@ Symptom (where error appears)
 **Rule:** NEVER fix where the error appears. Trace back to the source.
+## Exact Root-Cause Contract (MANDATORY)
+Before Step 4 implementation in `hapo:hotfix`, record:
+- Symptom: exact observable failure
+- Reproduction: command, user flow, CI job, log trigger, or route
+- Expected: intended behavior
+- Actual: observed behavior
+- Root cause: file:line, config, environment, dependency, or data source
+- Why now: recent change, data state, dependency drift, environment, timing, or load factor
+- Evidence chain: observations proving this cause
+- Blast radius: affected files, modules, tests, users, workflows, or release paths
 ### Phase 5: Escalate — When hypotheses fail
 If 2+ hypotheses are REFUTED → see `escalation-tactics.md`.
@@ -96,6 +111,15 @@ If 2+ hypotheses are REFUTED → see `escalation-tactics.md`.
 ### Root Cause
 [Clear explanation traced back to origin]
+### Exact Root-Cause Contract
+- Symptom:
+- Reproduction:
+- Expected:
+- Actual:
+- Root cause:
+- Why now:
+- Blast radius:
 ### Evidence Chain
 1. [Observation] → led to hypothesis [X]
 2. [Test result] → confirmed/refuted [X]
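
The mandatory root-cause contract added above lends itself to a completeness check before any product-code change starts. The following Python sketch is illustrative only: `RootCauseContract`, `missing_fields`, and `is_complete` are hypothetical names, and treating a blank string as "not recorded" is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass
class RootCauseContract:
    """The eight fields listed in the contract; all default to blank."""
    symptom: str = ""
    reproduction: str = ""
    expected: str = ""
    actual: str = ""
    root_cause: str = ""
    why_now: str = ""
    evidence_chain: str = ""
    blast_radius: str = ""


def missing_fields(contract: RootCauseContract) -> list[str]:
    """Return the names of contract fields that are still blank."""
    return [f.name for f in fields(contract) if not getattr(contract, f.name).strip()]


def is_complete(contract: RootCauseContract) -> bool:
    """The contract is complete only when every field has content."""
    return not missing_fields(contract)
```

Reporting which fields are missing, rather than a bare pass or fail, matches the skill's instruction to state what evidence is still needed when the root cause is unknown.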

package/src/claude/skills/hotfix/references/parallel-patterns.md CHANGED

@@ -18,9 +18,9 @@ When you need to understand multiple areas of the codebase before diagnosing:
 ```
 // Spawn in a SINGLE message — agents run concurrently
-Task("Explore", "Scan src/auth/ for token validation logic and recent changes")
-Task("Explore", "Scan src/middleware/ for request interceptors that touch headers")
-Task("Explore", "Find all test files matching *auth*.test.* and check coverage")
+Agent(subagent_type="Explore", prompt="Scan src/auth/ for token validation logic and recent changes")
+Agent(subagent_type="Explore", prompt="Scan src/middleware/ for request interceptors that touch headers")
+Agent(subagent_type="Explore", prompt="Find all test files matching *auth*.test.* and check coverage")
 ```
 Wait for all agents to return. Merge their findings into a unified context map before proceeding to diagnosis.
@@ -30,9 +30,9 @@ Wait for all agents to return. Merge their findings into a unified context map b
 After forming 2-3 hypotheses in Step 2 (Diagnose), test them concurrently:
 ```
-Task("Explore", "Verify hypothesis: cache returns stale data — check TTL config in src/cache/")
-Task("Explore", "Verify hypothesis: race condition in login flow — trace async calls in src/auth/login.ts")
-Task("Explore", "Verify hypothesis: env var missing in production — check .env.example vs deployed config")
+Agent(subagent_type="Explore", prompt="Verify hypothesis: cache returns stale data — check TTL config in src/cache/")
+Agent(subagent_type="Explore", prompt="Verify hypothesis: race condition in login flow — trace async calls in src/auth/login.ts")
+Agent(subagent_type="Explore", prompt="Verify hypothesis: env var missing in production — check .env.example vs deployed config")
 ```
 Each agent returns CONFIRMED, REFUTED, or INCONCLUSIVE with evidence.
@@ -42,10 +42,10 @@ Each agent returns CONFIRMED, REFUTED, or INCONCLUSIVE with evidence.
 After implementing a fix, validate from every angle simultaneously:
 ```
-Task("Bash", "npx tsc --noEmit")           // Typecheck
-Task("Bash", "npx eslint src/ --quiet")     // Lint
-Task("Bash", "npm run build")               // Build
-Task("Bash", "npm test -- --bail")           // Tests
+Bash: npx tsc --noEmit              // Typecheck
+Bash: npx eslint src/ --quiet        // Lint
+Bash: npm run build                  // Build
+Bash: npm test -- --bail             // Tests
 ```
 All four must pass. If any fails, investigate that specific failure before re-attempting.
@@ -79,10 +79,10 @@ In complex bugs (Deep workflow), Steps 1+2+3 should run **concurrently** to save
 ```
 // All three launch simultaneously:
-Task("Explore", "Scout: map affected files, dependencies, and test coverage")
+Agent(subagent_type="Explore", prompt="Scout: map affected files, dependencies, and test coverage")
 // Meanwhile, main agent starts diagnosis using available error context
 // Meanwhile:
-Task("researcher", "Research: find latest docs and known issues for [library/framework]")
+Agent(subagent_type="researcher", prompt="Research: find latest docs and known issues for [library/framework]")
 ```
 The main agent begins hypothesis formation (Step 2) immediately using the error message and stack trace. Scout results enrich the diagnosis when they arrive. Research results inform the fix approach.
@@ -98,6 +98,6 @@ The main agent begins hypothesis formation (Step 2) immediately using the error
 ## Fallback: When Task Tools Are Unavailable
-`TaskCreate`/`TaskUpdate` are CLI-only — they error in some IDE extensions. If they fail:
+`TaskCreate`/`TaskUpdate` are task-list tools and can be unavailable in some runtimes. If they fail:
 - Track progress manually using markdown checklists
 - The fix workflow itself remains fully functional — Tasks add visibility, not core logic
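
The parallel verification pattern in this file (typecheck, lint, build, and tests at once, all of which must pass) can be approximated outside the agent runtime with a thread pool and `subprocess`. This is a sketch, not the package's mechanism: `run_check` and `run_checks_in_parallel` are hypothetical helpers, and a check counts as passing only when its process exits with status 0.

```python
from __future__ import annotations

import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_check(name: str, command: list[str]) -> tuple[str, bool]:
    """Run one check and report whether it exited with status 0."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return name, completed.returncode == 0


def run_checks_in_parallel(checks: dict[str, list[str]]) -> dict[str, bool]:
    """Run every check concurrently and collect pass/fail per check name."""
    with ThreadPoolExecutor(max_workers=len(checks) or 1) as pool:
        futures = [pool.submit(run_check, name, cmd) for name, cmd in checks.items()]
        return dict(f.result() for f in futures)
```

A caller might pass `{"typecheck": ["npx", "tsc", "--noEmit"], "tests": ["npm", "test", "--", "--bail"]}` and treat the fix as verified only if every value in the result is `True`.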

package/src/claude/skills/hotfix/references/prevention-gate.md CHANGED

@@ -1,6 +1,6 @@
 # Prevention Gate
-After a fix is verified, apply defense-in-depth to prevent the same bug class from recurring.
+After a fix is verified, apply defense-in-depth to prevent the same bug class from recurring. Prevention is not the same as side-effect safety; run `references/debugger/side-effect-gate.md` before claiming completion.
 ## Mandatory Prevention Checklist
@@ -29,6 +29,13 @@ If the bug could recur silently:
 - Include structured context (IDs, timestamps, relevant state)
 - Ensure the log level is appropriate (warn for recoverable, error for critical)
+### 5. Layered Defense Guard
+If the same invalid state could enter through multiple paths:
+- Validate at the entry point where data first crosses a trust boundary
+- Validate in business logic where invariants must hold
+- Add environment/config guards when deployment conditions matter
+- Add diagnostic logging only at decision points that help future debugging
 ## Quick Mode Prevention
 For trivial fixes (lint, type errors), prevention is optional but encouraged:

package/src/claude/skills/hotfix/references/workflow-specialized.md CHANGED

@@ -27,6 +27,7 @@ When unit/integration/e2e tests are failing:
 2. **Check for pollution:** Does it pass alone but fail in suite? → Test order dependency or shared state leaking
 3. **Check for staleness:** Did the implementation change but the test assertions were not updated?
 4. **Snapshot drift:** If snapshot tests fail, review the diff carefully — is it an intentional change or a regression?
+5. **Flaky async:** Replace arbitrary sleeps with condition-based waits. See `references/debugger/condition-based-waiting.md`.
 ---
@@ -51,7 +52,8 @@ When the interface is broken, misaligned, or not rendering:
 1. **Capture visual evidence:** Use `pushd skills/chrome-devtools/scripts && node screenshot.js --url <url> && popd`
 2. **Check ARIA structure:** `pushd skills/chrome-devtools/scripts && node aria-snapshot.js --url <url> && popd` — reveals hidden overlays, z-index battles
 3. **Check console errors:** `pushd skills/chrome-devtools/scripts && node console.js --url <url> && popd` — catches JS crashes preventing render
-4. **Common traps:**
+4. **Run frontend verification:** Follow `references/debugger/frontend-verification.md` for screenshot, console, network, accessibility, responsive, and interaction evidence.
+5. **Common traps:**
    - CSS specificity wars (use browser DevTools or ARIA snapshot to verify computed styles)
    - Hydration mismatch in SSR frameworks (server HTML differs from client render)
    - Missing responsive breakpoints (test at multiple viewport widths)
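
The "Flaky async" item added to the test-failure checklist recommends condition-based waits over arbitrary sleeps. A generic Python sketch of that technique is below; it does not reproduce the contents of `condition-based-waiting.md`, which this diff does not show, and `wait_for` with its default `timeout` and `interval` values is a hypothetical helper.

```python
import time
from typing import Callable


def wait_for(condition: Callable[[], bool], timeout: float = 5.0, interval: float = 0.05) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Compared with a fixed `time.sleep(2)`, this returns as soon as the condition holds and fails with an explicit timeout when it never does, which turns an intermittent failure into a reproducible one.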

package/src/claude/skills/inspect/SKILL.md CHANGED

@@ -37,7 +37,7 @@ Instead of rejecting, use a **2-phase approach**: first run a lightweight Struct
 Spawn **one dedicated scout agent** to map the top-level structure before any work is divided:
-1. **Discover top-level layout** - Use `Glob` / `LS` to list immediate children of the scope root:
+1. **Discover top-level layout** - Use `Glob` or `Bash` `ls` to list immediate children of the scope root:
    - Top-level directories (src/, apps/, backend/, frontend/, packages/, etc.)
    - Key config files (README.md, package.json, tsconfig.json, pyproject.toml, go.mod, etc.)
    - Monorepo markers (packages/*, apps/*, lerna.json, pnpm-workspace.yaml, turbo.json)
@@ -113,7 +113,7 @@ Follow-up: "Want to investigate deeper? Choose: backend API | frontend component
 ## Configuration
-Read from `packages/spec/src/claude/runtime.json`:
+Read from `.claude/runtime.json`:
 ```json
 {
   "gemini": {