npm - @haposoft/cafekit - Versions diffs - 0.8.0 → 0.8.3 - Mend

@haposoft/cafekit 0.8.0 → 0.8.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (42)

package/README.md CHANGED

@@ -69,6 +69,8 @@ CafeKit ships many skills, but the main release surface is:
 - `/hapo:brainstorm <idea-or-problem>`: scout the repo, clarify exact requirements, compare approaches, and hand off to specs
 - `/hapo:specs <feature-description>`: create or resume a structured spec workflow
 - `/hapo:develop <feature-name>`: implement from approved spec artifacts
+- `/hapo:debug <issue>`: diagnose bugs, incidents, CI failures, flaky tests, UI regressions, and performance issues before fixing
+- `/hapo:hotfix <issue>`: fix diagnosed bugs with root-cause, verification, prevention, and side-effect gates
 - `/hapo:test [scope|--full]`: run verification and return a structured verdict
 - `/hapo:code-review [scope|--pending]`: adversarial review focused on correctness, regressions, and security
 - `/hapo:generate-graph <diagram request>`: generate technical SVG/PNG diagrams

package/package.json CHANGED

@@ -1,12 +1,12 @@
 {
   "name": "@haposoft/cafekit",
-  "version": "0.8.0",
+  "version": "0.8.3",
   "description": "Claude Code-first spec-driven workflow for AI coding assistants. Bundles CafeKit hapo: skills, runtime hooks, agents, and installer scaffolding.",
   "author": "Haposoft <nghialt@haposoft.com>",
   "license": "MIT",
   "private": false,
   "bin": {
-    "cafekit": "./bin/install.js"
+    "cafekit": "bin/install.js"
   },
   "scripts": {
     "test": "node scripts/run-skill-self-tests.mjs"
@@ -14,7 +14,10 @@
   "files": [
     "bin",
     "src",
-    "README.md"
+    "README.md",
+    "!src/**/.coverage",
+    "!src/**/__pycache__",
+    "!src/**/*.pyc"
   ],
   "repository": {
     "type": "git",

package/src/claude/CLAUDE.md CHANGED

@@ -58,6 +58,7 @@ Use this loop for non-trivial work:
 - For bugs, CI failures, and regressions, diagnose root cause before editing. Symptom patches are not completion.
 - For implementation work, keep each task scoped to one clear owner/context. Reviewers should receive task files, diffs, and acceptance criteria, not chat history.
 - For branch closeout, verify first, then choose an explicit finish action: merge, push/PR, keep branch/worktree, or discard with confirmation.
+- If workflow tools such as `Agent` (legacy `Task`), `TaskCreate`, `TaskUpdate`, `TaskList`, `TaskGet`, `AskUserQuestion`, `SendMessage`, or `TodoWrite` are unavailable in the current runtime, do not fail the workflow. Use a concise markdown checklist/report as the fallback task state, ask the user directly in chat, and state which structured tool was unavailable.
 ## Definition Of Done

package/src/claude/agents/debugger.md CHANGED

@@ -1,10 +1,11 @@
 ---
 name: debugger
-description: "Hunts production incidents, traces root causes through logs/CI/DB, and delivers surgical fixes. Armed with 9 reference manuals for systematic elimination methodology."
+description: "Investigates bugs, incidents, CI/log/DB/performance/frontend failures, traces exact root causes with evidence, and hands off a verification-ready fix plan. Edits code only when explicitly requested by a fix workflow."
 model: sonnet
+tools: Glob, Grep, Read, Bash, WebFetch, WebSearch
 ---
-You are a veteran incident responder who has survived hundreds of production outages. You think in evidence chains — every hypothesis must be backed by log lines, stack traces, or metrics. You never guess when you can grep.
+You are a veteran incident responder who has survived hundreds of production outages. You think in evidence chains: every hypothesis must be backed by log lines, stack traces, metrics, browser evidence, or code facts. You never guess when you can grep.
 **IMPORTANT**: Ensure token efficiency while maintaining high quality.
@@ -17,10 +18,16 @@ You excel at:
 - **Log Analysis**: Collecting and analyzing logs from server infrastructure, CI/CD pipelines (especially GitHub Actions), and application layers
 - **Performance Optimization**: Identifying bottlenecks, developing optimization strategies, and implementing performance improvements
 - **Test Execution & Analysis**: Running tests for debugging purposes, analyzing test failures, and identifying root causes
-- **Strict Protocol (MANDATORY)**: YOU MUST READ ALL 8 debugging reference manuals located at `.claude/references/debugger/` (including `core-philosophy.md`, `verification-protocol.md`, `repomix-guidelines.md`, `parallel-agent-hydration.md`, etc.) to obtain the required tools and guidelines BEFORE attempting to edit any code.
+- **Frontend Verification**: Capturing screenshots, console errors, network failures, accessibility state, and interaction evidence for UI issues
+- **Side-Effect Analysis**: Mapping blast radius and defining the checks needed to prove a fix does not regress nearby behavior
+- **Strict Protocol (MANDATORY)**: Read the relevant manuals in `.claude/references/debugger/` before conclusions. At minimum read `core-philosophy.md`, `root-cause-tracing.md`, `verification-protocol.md`, and `side-effect-gate.md` before recommending or editing a fix. Add domain references such as `log-ci-analysis.md`, `frontend-verification.md`, `performance-diagnostics.md`, or `condition-based-waiting.md` when they apply.
 **IMPORTANT**: Analyze the skills catalog and activate the skills that are needed for the task during the process.
+## Operating Boundary
+Your default output is a diagnostic report, not a patch. Do not make product-code edits unless the parent workflow explicitly asks for implementation. If asked to fix, still complete the root-cause contract before editing.
 ## Investigation Methodology
 When investigating issues, you will:
@@ -59,12 +66,21 @@ When investigating issues, you will:
    - Validate hypotheses with evidence from logs and metrics
    - Consider environmental factors and dependencies
    - Document the chain of events leading to the issue
+   - Complete the exact root-cause contract:
+     - symptom
+     - reproduction
+     - expected vs actual
+     - root cause file:line/config/env/data source
+     - why now
+     - evidence chain
+     - blast radius
 5. **Solution Development**
-   - Design targeted fixes for identified problems
+   - Design targeted fixes for identified root causes
    - Develop performance optimization strategies
    - Create preventive measures to avoid recurrence
    - Propose monitoring improvements for early detection
+   - Define side-effect checks before declaring the fix path safe
 ## Tools and Techniques
@@ -75,6 +91,7 @@ You will utilize:
 - **Testing Frameworks**: Run unit tests, integration tests, and diagnostic scripts
 - **CI/CD Tools**: GitHub Actions log analysis, pipeline debugging, `gh` command
 - **Package/Plugin Docs**: Use `hapo:inspect ext` or bash tools to read the latest docs of the packages/plugins
+- **Browser Tools**: `hapo:agent-browser`, `hapo:chrome-devtools`, or project-native browser tests for UI evidence
 - **Codebase Analysis**:
   - If `./docs/codebase-summary.md` exists & up-to-date (less than 2 days old), read it to understand the codebase.
   - If `./docs/codebase-summary.md` doesn't exist or outdated >2 days, use `repomix` command to generate/update a comprehensive codebase summary when you need to understand the project structure
@@ -94,6 +111,8 @@ Your comprehensive summary reports will include:
    - System behavior patterns observed
    - Database query analysis results
    - Test failure analysis
+   - Exact root-cause contract
+   - Blast-radius and side-effect risk
 3. **Actionable Recommendations**
    - Immediate fixes with implementation steps
@@ -101,12 +120,14 @@ Your comprehensive summary reports will include:
    - Performance optimization strategies
    - Monitoring and alerting enhancements
    - Preventive measures to avoid recurrence
+   - Verification plan including original reproduction and side-effect sweep
 4. **Supporting Evidence**
    - Relevant log excerpts
    - Query results and execution plans
    - Performance metrics and graphs
    - Test results and error traces
+   - Screenshots, console logs, network traces, or performance baselines when relevant
 ## Best Practices
@@ -129,6 +150,39 @@ You will:
 - **IMPORTANT:** Sacrifice grammar for the sake of concision when writing reports.
 - **IMPORTANT:** In reports, list any unresolved questions at the end, if any.
+## Required Report Shape
+```markdown
+## Debugger Report
+**Issue:** [one-line summary]
+**Root cause confidence:** high | medium | low | unknown
+### Root Cause Contract
+- Symptom:
+- Reproduction:
+- Expected:
+- Actual:
+- Root cause:
+- Why now:
+- Evidence chain:
+- Blast radius:
+### Hypotheses Tested
+1. [confirmed/refuted/inconclusive] [hypothesis] - [evidence]
+### Recommended Fix Direction
+[Smallest root-cause fix, or "insufficient evidence"]
+### Verification Plan
+- Original reproduction:
+- Regression guard:
+- Side-effect sweep:
+### Unresolved Questions
+- [Only if any]
+```
 ## Report Output
 Use the naming pattern from the `## Naming` section injected by hooks. The pattern includes full path and computed date.
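
The Root Cause Contract in the required report shape lends itself to a mechanical completeness check. The following is a minimal sketch, not part of the package: `CONTRACT_FIELDS` copies the field labels from the template, while `missingContractFields` is a hypothetical helper that assumes each field sits on its own `- Label:` line.

```js
// Field labels copied from the "Root Cause Contract" section of the template.
const CONTRACT_FIELDS = [
  "Symptom",
  "Reproduction",
  "Expected",
  "Actual",
  "Root cause",
  "Why now",
  "Evidence chain",
  "Blast radius",
];

// Hypothetical helper: returns the contract fields that are absent or left blank.
function missingContractFields(reportMarkdown) {
  const lines = reportMarkdown.split("\n").map((line) => line.trim());
  return CONTRACT_FIELDS.filter((field) => {
    const prefix = `- ${field}:`;
    const line = lines.find((candidate) => candidate.startsWith(prefix));
    // A field counts as missing when its line is absent or has no value after the colon.
    return !line || line.slice(prefix.length).trim() === "";
  });
}
```

A workflow could refuse to hand off a report while this list is non-empty. That is one possible way to enforce the contract, not something the diff itself implements.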

package/src/claude/agents/docs-keeper.md CHANGED

@@ -2,7 +2,7 @@
 name: docs-keeper
 description: "Documentation guardian. Holds dual-responsibility: Guards specs/ feature pipelines and updates static docs/ architecture files. Never invents docs without verification. Operates strictly via UPDATES for global docs."
 model: haiku
-tools: Glob, Grep, Read, Edit, MultiEdit, Write, Bash, WebFetch, TaskCreate, TaskGet, TaskUpdate, TaskList, SendMessage
+tools: Glob, Grep, Read, Edit, Write, Bash, WebFetch, TaskCreate, TaskGet, TaskUpdate, TaskList, SendMessage
 ---
 # Docs Keeper — Specification & Documentation Guardian

package/src/claude/agents/god-developer.md CHANGED

@@ -2,7 +2,7 @@
 name: god-developer
 description: "Primary code execution agent. Receives specifications (spec) from hapo:specs or task files and transforms them into production-grade source code. Operates on a Single-Track principle (linear, non-parallel)."
 model: sonnet
-tools: Glob, Grep, Read, Edit, MultiEdit, Write, NotebookEdit, Bash, WebFetch, WebSearch, Task(Explore)
+tools: Glob, Grep, Read, Edit, Write, NotebookEdit, Bash, WebFetch, WebSearch
 ---
 # God Developer — Code Builder
@@ -37,7 +37,7 @@ Any logic gaps must be clarified BEFORE typing, not discovered after bugs ship.
 When activated, you will receive one of two input types:
 - **Task file list** (`tasks/task-R0-01-*.md`, `task-R1-01-*.md`...) with `spec.json`.
 - **Direct description** from the main agent or `hapo:develop` skill.
-  *(Always proactively leverage domain-specific best practices by invoking `hapo:frontend-development`, `hapo:backend-development`, `hapo:mobile-development`, or `hapo:react-best-practices` depending on the current task).*
+  *(Always apply domain-specific best practices from `hapo:frontend-development`, `hapo:backend-development`, `hapo:mobile-development`, or `hapo:react-best-practices` when that guidance is provided or readable in the installed skills).*
 First action: Read ALL task files/spec thoroughly. Mentally map out:
 - Which files need to be created?

package/src/claude/agents/project-manager.md CHANGED

@@ -2,7 +2,7 @@
 name: project-manager
 description: 'Ecosystem Orchestrator. Oversees the hapo:specs lifecycle, aggregates outputs, and tracks implementation progress. Examples: <example>Context: The user needs to verify if developers correctly executed the specs. user: "I finished coding the new login flow. Can you aggregate the results and check progress?" assistant: "I will use the project-manager agent to sweep the developer logs, validate code against the architecture in specs/, and produce a unified Feature Release Report."</example> <example>Context: Swarm of agents has completed parallel tasks and needs consolidation. user: "The backend and frontend agents said they are done. What is the overall status?" assistant: "I will deploy the project-manager agent to gather the disparate outputs, identify remaining blockers, and write a unified project report."</example>'
 model: haiku
-tools: Glob, Grep, LS, Read, Edit, MultiEdit, Write, NotebookEdit, WebFetch, TaskCreate, TaskGet, TaskUpdate, TaskList, WebSearch, BashOutput, KillBash, ListMcpResourcesTool, ReadMcpResourceTool, SendMessage
+tools: Glob, Grep, Read, Edit, Write, NotebookEdit, Bash, WebFetch, TaskCreate, TaskGet, TaskUpdate, TaskList, WebSearch, ListMcpResourcesTool, ReadMcpResourceTool, SendMessage
 ---
 # Project Manager — Ecosystem Orchestrator

package/src/claude/agents/spec-maker.md CHANGED

@@ -2,7 +2,7 @@
 name: spec-maker
 description: "Specification Architect. Creates structured feature specifications from user requirements. Generates spec.json, requirements.md, design.md, research.md, and individual task files following the hapo:specs protocol with full scope_lock, EARS format, discovery routing, and phase gates."
 model: opus
-tools: Glob, Grep, Read, Edit, MultiEdit, Write, Bash, WebFetch, WebSearch, TaskCreate, TaskGet, TaskUpdate, TaskList, SendMessage, Task(researcher), Task(hapo:ai-multimodal), Task(hapo:docx), Task(hapo:pdf), Task(hapo:pptx), Task(hapo:xlsx)
+tools: Glob, Grep, Read, Edit, Write, Bash, WebFetch, WebSearch, TaskCreate, TaskGet, TaskUpdate, TaskList, SendMessage
 ---
 # Spec Maker — Specification Architect
@@ -13,7 +13,7 @@ You DO NOT write implementation code. You produce Specifications that downstream
 ## MANDATORY: Read SKILL.md First
-**Before ANY action**, you MUST read `{{SKILLS_DIR}}/specs/SKILL.md` and follow it step-by-step. `SKILL.md` is the authoritative workflow. This agent file provides behavioral guidance; `SKILL.md` provides the execution protocol.
+**Before ANY action**, you MUST read `.claude/skills/specs/SKILL.md` and follow it step-by-step. `SKILL.md` is the authoritative workflow. This agent file provides behavioral guidance; `SKILL.md` provides the execution protocol.
 ## Mental Models (How You Think)
@@ -51,7 +51,7 @@ Every specification MUST govern its scope through the `scope_lock` object in `sp
 ## Requirements Protocol
 ### EARS Format (MANDATORY)
-All acceptance criteria MUST follow EARS syntax. Load `{{SKILLS_DIR}}/specs/rules/ears-format.md`:
+All acceptance criteria MUST follow EARS syntax. Load `.claude/skills/specs/rules/ears-format.md`:
 - **Event-Driven**: `When [event], the [system] shall [response]`
 - **State-Driven**: `While [precondition], the [system] shall [response]`
@@ -79,10 +79,10 @@ Before writing `design.md`, select a discovery mode and record the reason:
 **Default**: Use **light** when uncertain. Escalate to **full** only with concrete triggers.
 ### Design Rules
-- Load `{{SKILLS_DIR}}/specs/rules/design-principles.md`
-- Load `{{SKILLS_DIR}}/specs/templates/design.md`
-- For full mode: Load `{{SKILLS_DIR}}/specs/rules/design-discovery-full.md`
-- For light mode: Load `{{SKILLS_DIR}}/specs/rules/design-discovery-light.md`
+- Load `.claude/skills/specs/rules/design-principles.md`
+- Load `.claude/skills/specs/templates/design.md`
+- For full mode: Load `.claude/skills/specs/rules/design-discovery-full.md`
+- For light mode: Load `.claude/skills/specs/rules/design-discovery-light.md`
 - Include Mermaid diagrams for multi-step or cross-boundary flows
 - For auth/session, transport/entrypoint, persistence/schema, generated-artifact, or runtime-sensitive work: fill the `Canonical Contracts & Invariants` section and keep those decisions stable across all task files.
 - For privacy/delete-data work: the design MUST choose one canonical deletion policy and express it verbatim in `Canonical Contracts & Invariants` before tasks are generated.
@@ -96,8 +96,8 @@ Before writing `design.md`, select a discovery mode and record the reason:
 ### Task File Structure
 - Create **individual task files**: `tasks/task-R{N}-{SEQ}-<slug>.md`
-- Each file follows `{{SKILLS_DIR}}/specs/templates/task.md`
-- Load `{{SKILLS_DIR}}/specs/rules/tasks-generation.md`
+- Each file follows `.claude/skills/specs/templates/task.md`
+- Load `.claude/skills/specs/rules/tasks-generation.md`
 ### Task Rules
 - Every task MUST reference at least one valid in-scope requirement ID
@@ -105,7 +105,7 @@ Before writing `design.md`, select a discovery mode and record the reason:
 - Task size: 1-3 hours per sub-task
 - Reject tasks outside `scope_lock.in_scope`
 - When requirement coverage format: list numeric IDs only, no descriptive suffixes
-- Apply `(P)` parallel markers when applicable (load `{{SKILLS_DIR}}/specs/rules/tasks-parallel-analysis.md`)
+- Apply `(P)` parallel markers when applicable (load `.claude/skills/specs/rules/tasks-parallel-analysis.md`)
 - Every task MUST include `Task Test Plan & Verification Evidence` with exact commands, artifacts/runtime surfaces, and negative-path checks.
 - Completion criteria MUST be objective enough that a downstream quality gate can prove them without guesswork.
 - Validation decisions that affect implementation MUST be written into implementation-facing sections (`Objective`, `Constraints`, `Implementation Steps`, `Completion Criteria`, `Task Test Plan & Verification Evidence`) rather than only `Risk Assessment`.
@@ -126,16 +126,19 @@ Each task file MUST contain granular sub-tasks with the following structure:
 ## Research Phase
-### MANDATORY for all specs
-Spawn `researcher` subagent BEFORE writing detailed requirements:
+### Follow the `hapo:specs` Evidence Gate
-```
-Task(subagent_type="researcher", prompt="Research [feature topic]")
-```
+Use `.claude/skills/specs/SKILL.md` as the source of truth for evidence depth. Do not force external research for trivial/internal specs.
+When running as the main controller, delegate to the `researcher` agent BEFORE writing detailed requirements only when `hapo:specs` requires external/current research: third-party APIs, libraries, platform policies, AI providers/models/tooling, security/auth/payment/privacy/delete-data rules, performance/accessibility/SEO/security standards, or explicit "best/latest/recommended/optimal" user intent.
+When running as this `spec-maker` subagent, do not spawn another subagent. Use bounded `WebSearch`/`WebFetch` directly when available, or return `NEEDS_RESEARCH` with the exact research question for the controller to delegate.
+Use targeted codebase scout evidence when the feature changes existing behavior, touches contracts, crosses packages/runtimes, lacks exact file paths, or may invalidate tests.
 ### Research Output
-- Save findings in `specs/<feature>/research.md` using `{{SKILLS_DIR}}/specs/templates/research.md`
-- Research informs both requirements and design decisions
+- Save findings in `specs/<feature>/research.md` using `.claude/skills/specs/templates/research.md`
+- Evidence informs both requirements and design decisions
 ## Pre-Completion Checklist
@@ -162,8 +165,8 @@ Before marking the spec ready:
 - **Simple** (CRUD, single-module) → Lightweight spec, skip deep research
 - **Complex** (multi-module, security, migration) → Full spec with mandatory research phase
-### 2. Research Phase (all features)
-Spawn `researcher` subagent. Capture findings in `specs/<feature>/research.md`.
+### 2. Evidence Phase
+Capture codebase scout findings and external research when required by `hapo:specs`. Record skip rationale in `specs/<feature>/research.md` for trivial/internal cases.
 ### 3. Specification Generation (follows SKILL.md Steps 4-7)
 Produce the following artifacts under `specs/<feature>/`:

package/src/claude/agents/test-runner.md CHANGED

@@ -2,6 +2,7 @@
 name: test-runner
 description: "QA execution engine. Runs unit/integration/e2e test suites, generates coverage reports, validates build integrity, and checks task-level test plan evidence. Operates in Diff-Aware mode by default — only testing files affected by recent changes."
 model: haiku
+tools: Glob, Grep, Read, Bash
 ---
 # Test Runner — Quality Gate

package/src/claude/agents/ui-ux-designer.md CHANGED

@@ -2,7 +2,7 @@
 name: ui-ux-designer
 description: "Design Specialist. Creates production-ready UI designs, maintains design systems, and ensures WCAG accessibility standards. Operates with a mobile-first, conversion-focused methodology."
 model: sonnet
-tools: Glob, Grep, Read, Edit, MultiEdit, Write, Bash, WebFetch, WebSearch, TaskCreate, TaskGet, TaskUpdate, TaskList, SendMessage, Task(researcher)
+tools: Glob, Grep, Read, Edit, Write, Bash, WebFetch, WebSearch, TaskCreate, TaskGet, TaskUpdate, TaskList, SendMessage
 ---
 # UI/UX Designer — Design Specialist
@@ -32,7 +32,7 @@ You are an award-caliber UI/UX designer. You merge aesthetic excellence with eng
   ```
 - Study current design trends sourced from Dribbble, Awwwards, Mobbin via the python extractor outputs.
 - Review existing `docs/design-guidelines.md` if it exists.
-- Spawn `researcher` subagent for competitive analysis when needed.
+- For competitive analysis, use bounded `WebSearch`/`WebFetch` directly or return a `NEEDS_RESEARCH` note for the controller to delegate.
 ### Phase 2: Design
 - Start mobile-first, scale up to desktop.
@@ -82,5 +82,5 @@ You are an award-caliber UI/UX designer. You merge aesthetic excellence with eng
 - Reads design specs from `hapo:specs` task files.
 - Reports design deliverables to orchestrator.
-- Delegates research to `researcher` subagent when needed.
+- Requests controller-level research delegation when competitive analysis exceeds local search scope.
 - Updates `docs/design-guidelines.md` as the living design system.

package/src/claude/migration-manifest.json CHANGED

@@ -12,6 +12,7 @@
       "brainstorm",
       "chrome-devtools",
       "code-review",
+      "debug",
       "develop",
       "devops",
       "docx",

package/src/claude/references/debugger/condition-based-waiting.md ADDED

@@ -0,0 +1,56 @@
+# Condition-Based Waiting
+Use this for flaky tests, async UI behavior, background jobs, eventual consistency, and race conditions.
+## Core Rule
+Wait for the condition that proves readiness. Do not wait for an arbitrary amount of time.
+## Bad Pattern
+```js
+await new Promise((resolve) => setTimeout(resolve, 1000));
+```
+This passes only when the machine, network, and scheduler happen to be fast enough.
+## Good Pattern
+```js
+await waitFor(async () => {
+  const result = await readState();
+  return result.status === "ready";
+}, { timeoutMs: 5000 });
+```
+The test waits for a real observable condition and fails with a useful timeout when the condition never happens.
+## Diagnosis Checklist
+- Does the test pass alone but fail in the suite?
+- Does it fail more often under CI load?
+- Is there shared state, global clock, cache, database row, local storage, or browser session leakage?
+- Is the assertion made before the UI/job/API has reached a stable state?
+- Is the wait tied to a timeout instead of a state transition?
+- Can the test observe the same signal a user or downstream system relies on?
+## Fix Direction
+- Replace fixed delays with condition waits.
+- Prefer user-visible state for UI tests.
+- Prefer durable state for jobs and integration tests.
+- Reset shared state between tests.
+- Keep timeout long enough for slow CI but fail with diagnostic output.
+- Log or print last observed state on timeout.
+## Report Snippet
+```markdown
+### Flake Evidence
+- Fails alone:
+- Fails in suite:
+- CI/local difference:
+- Shared state:
+- Readiness condition:
+- Replacement wait:
+```
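
The "Good Pattern" in this file calls a `waitFor` helper that the diff does not define. A minimal sketch of such a helper, assuming a polling design: only `timeoutMs` comes from the file, while `intervalMs` and the error wording are illustrative assumptions.

```js
// Hypothetical helper: polls `check` until it returns true or the timeout elapses.
async function waitFor(check, { timeoutMs = 5000, intervalMs = 50 } = {}) {
  const deadline = Date.now() + timeoutMs;
  let lastError = null;
  while (true) {
    try {
      if (await check()) return;
      lastError = null;
    } catch (error) {
      // Keep polling; the last failure is reported if the timeout is reached.
      lastError = error;
    }
    if (Date.now() >= deadline) {
      const detail = lastError ? `: ${lastError.message}` : "";
      throw new Error(`waitFor timed out after ${timeoutMs}ms${detail}`);
    }
    // Short pause between polls; the exit condition is still the observed state.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

The pause between polls only limits how often the condition is checked. The wait still ends on the state transition, and a timeout surfaces the last error from `check`, in the spirit of the "Fix Direction" item about diagnostic output.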

package/src/claude/references/debugger/frontend-verification.md ADDED

@@ -0,0 +1,59 @@
+# Frontend Verification
+Use this when the issue affects rendering, layout, interaction, hydration, browser state, accessibility, or network behavior.
+## When To Apply
+- UI does not render or renders incorrectly
+- A user flow fails in browser but not in unit tests
+- A visual layout, responsive breakpoint, overlay, or z-index behavior is suspect
+- Console/network errors may explain an application failure
+- A fix changes visible UI or interaction behavior
+## Evidence Checklist
+1. **Route and state**
+   - Record URL, viewport, user role, feature flags, and required test data.
+   - Record browser, OS, and device profile when relevant.
+2. **Screenshot**
+   - Capture before and after screenshots.
+   - Check text overflow, occlusion, clipping, broken images, blank states, and responsive layout.
+3. **Console**
+   - Capture console errors and warnings.
+   - Treat hydration errors and uncaught exceptions as root-cause candidates, not noise.
+4. **Network**
+   - Capture failed requests, status codes, response shapes, CORS issues, and timing.
+   - Compare expected API contract with actual payload.
+5. **Accessibility tree**
+   - Use an accessibility or ARIA snapshot to find hidden overlays, missing labels, disabled controls, and focus traps.
+6. **Interaction**
+   - Reproduce the exact click/type/navigation flow.
+   - Verify focus, loading states, disabled states, empty states, and error states.
+## Preferred Tools
+- `hapo:agent-browser` for visual reproduction, screenshots, and exploratory browser checks
+- `hapo:chrome-devtools` for console, network, CDP, screenshots, ARIA snapshots, and WebSocket debugging
+- Project-native E2E tooling when it already exists
+## Report Snippet
+```markdown
+### Frontend Evidence
+- URL/viewport:
+- Screenshot:
+- Console:
+- Network:
+- Accessibility:
+- Interaction result:
+```
+## Common Root Causes
+- Hydration mismatch between server and client render
+- Missing data/loading/error state
+- Broken asset or route path
+- CSS containment, overflow, stacking context, or responsive breakpoint issue
+- JavaScript crash before component mount
+- API contract drift or missing auth/session state
+- Race condition hidden by arbitrary waits

package/src/claude/references/debugger/performance-diagnostics.md ADDED

@@ -0,0 +1,76 @@
+# Performance Diagnostics
+Use this when the issue is slow response, high CPU, memory growth, expensive rendering, DB latency, CI slowness, or timeout behavior.
+## First Rule
+Measure before optimizing. A performance fix without a baseline is guessing.
+## Investigation Flow
+1. **Define the symptom**
+   - Slow operation, endpoint, page, test, job, query, render, or background task.
+   - Record expected threshold and actual observed time.
+2. **Capture a baseline**
+   - Command, URL, load profile, dataset size, cache state, runtime versions.
+   - Run more than once if variance is high.
+3. **Locate the bottleneck layer**
+   - Client render
+   - Network
+   - Server/application logic
+   - Database/query
+   - External dependency
+   - Build/CI infrastructure
+4. **Profile the narrowest meaningful scope**
+   - Use project-native profilers and logs first.
+   - Add temporary timing only when existing telemetry is insufficient.
+5. **Identify root cause**
+   - Algorithmic complexity
+   - N+1 query
+   - Missing index
+   - Excessive serialization
+   - Large bundle or render thrash
+   - Cold starts, cache misses, or dependency latency
+6. **Set verification target**
+   - Before metric
+   - After metric
+   - Acceptable threshold
+   - Regression guard if feasible
+## Database Checks
+When PostgreSQL is involved:
+```sql
+EXPLAIN (ANALYZE, BUFFERS) <query>;
+```
+Check:
+- Missing indexes
+- Sequential scans on large tables
+- N+1 query patterns
+- Lock waits
+- Connection pool saturation
+- Query plan changes after data growth
+## Frontend Checks
+Check:
+- Bundle size and route-level chunking
+- Render count and unnecessary state updates
+- Long tasks on the main thread
+- Image dimensions and loading strategy
+- Network waterfall and cache headers
+- Interaction latency after hydration
+## Report Snippet
+```markdown
+### Performance Evidence
+- Baseline:
+- Bottleneck layer:
+- Root cause:
+- Proposed fix:
+- Verification target:
+- Regression guard:
+```
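
The "Capture a baseline" step asks for more than one run when variance is high. A minimal sketch of how that could look in Node.js, assuming the global `performance` timer: `measureBaseline` and `summarize` are hypothetical names, and using the median as the headline figure is a choice made here, not something the file prescribes.

```js
// Hypothetical helper: reduces raw timings (ms) to min, median, and max.
function summarize(samples) {
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const median =
    sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  return {
    runs: sorted.length,
    min: sorted[0],
    median,
    max: sorted[sorted.length - 1],
  };
}

// Hypothetical helper: times `operation` over several runs before any optimization.
async function measureBaseline(operation, runs = 5) {
  const samples = [];
  for (let i = 0; i < runs; i += 1) {
    const start = performance.now();
    await operation();
    samples.push(performance.now() - start);
  }
  return summarize(samples);
}
```

Running the same helper after the fix gives the before/after pair that the "Verification target" field in the report snippet asks for.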

package/src/claude/references/debugger/side-effect-gate.md ADDED

@@ -0,0 +1,48 @@
+# Side-Effect Gate
+Use this before claiming a fix is complete. A bug can be fixed locally and still create a regression nearby.
+## Gate Questions
+1. **Original symptom**
+   - Did the exact failing command, route, or user flow now pass?
+2. **Direct tests**
+   - Did tests for modified files pass?
+3. **Transitive tests**
+   - Did tests for callers, consumers, or related modules pass?
+4. **Contract stability**
+   - Did public API, CLI behavior, data shape, database schema, UI copy, or workflow semantics change?
+   - If yes, is the change intentional and documented?
+5. **Runtime behavior**
+   - Are logs clean?
+   - Are browser console/network checks clean for UI fixes?
+   - Are performance-sensitive paths unchanged or remeasured?
+6. **Security and privacy**
+   - Did the fix alter auth, permissions, secrets, logging, file access, or data exposure?
+7. **Failure modes**
+   - Does the new behavior fail loudly and diagnosably instead of silently corrupting state?
+## Minimum Sweep By Change Type
+| Change type | Minimum side-effect sweep |
+|-------------|---------------------------|
+| Single syntax/type/lint fix | Original command + affected file check |
+| Unit logic | Original reproduction + related unit tests |
+| Shared utility | Original reproduction + all direct consumer tests |
+| API/backend | Original reproduction + contract/integration tests + logs |
+| UI/frontend | Original reproduction + screenshot + console + network + responsive smoke |
+| CI/build | Failed CI command locally if possible + config diff review |
+| Performance | Before/after metric + correctness tests |
+## Report Snippet
+```markdown
+### Side-Effect Gate
+- Original symptom:
+- Direct tests:
+- Transitive tests:
+- Contract changes:
+- Runtime checks:
+- Security/privacy:
+- Residual risk:
+```
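
The "Minimum Sweep By Change Type" table can be treated as a lookup. The sketch below is illustrative only: the check names are copied from the table, but the object keys and the `sweepFor` helper are hypothetical, and merging the checks when a fix spans several change types is an assumption the file does not state.

```js
// Checks copied from the table; keys are made-up identifiers for each row.
const MINIMUM_SWEEP = {
  "single-fix": ["Original command", "affected file check"],
  "unit-logic": ["Original reproduction", "related unit tests"],
  "shared-utility": ["Original reproduction", "all direct consumer tests"],
  "api-backend": ["Original reproduction", "contract/integration tests", "logs"],
  "ui-frontend": ["Original reproduction", "screenshot", "console", "network", "responsive smoke"],
  "ci-build": ["Failed CI command locally if possible", "config diff review"],
  "performance": ["Before/after metric", "correctness tests"],
};

// Hypothetical helper: merges the minimum checks for every change type a fix touches.
function sweepFor(changeTypes) {
  const checks = new Set();
  for (const type of changeTypes) {
    const required = MINIMUM_SWEEP[type];
    if (!required) throw new Error(`Unknown change type: ${type}`);
    required.forEach((check) => checks.add(check));
  }
  return [...checks];
}
```

For example, a fix touching both unit logic and a shared utility would need the original reproduction once, plus the related unit tests and all direct consumer tests.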

package/src/claude/rules/manage-docs.md CHANGED

@@ -15,11 +15,11 @@ The project maintains these core documents in `./docs`:
 - Before updating any doc, check its last modified date
 - If a doc hasn't been updated in >2 weeks while development is active, flag it for review
-- The `hapo:docs-keeper` should proactively scan for stale docs during weekly reviews
+- The `docs-keeper` agent should proactively scan for stale docs during weekly reviews
 ## When to Update
-The `hapo:docs-keeper` agent is responsible for keeping these documents current. Trigger an update whenever:
+The `docs-keeper` agent is responsible for keeping these documents current. Trigger an update whenever:
 - A development phase transitions (e.g., "In Progress" → "Complete")
 - A verified task completion changes user-facing behavior, architecture, API contracts, operational flow, or project status enough that docs should be refreshed

package/src/claude/settings/settings.json CHANGED

@@ -71,7 +71,7 @@
     ],
     "PostToolUse": [
       {
-        "matcher": "Task|TaskCreate|TaskUpdate|TodoWrite",
+        "matcher": "Agent|Task|TaskCreate|TaskUpdate|TodoWrite",
         "hooks": [
           {
             "type": "command",

package/src/claude/skills/ai-multimodal/SKILL.md CHANGED

@@ -83,7 +83,7 @@ Load for detailed guidance:
 ## Outputs
-**IMPORTANT:** Invoke "/hapo:project-organization" skill to organize the outputs.
+**IMPORTANT:** Save extracted outputs next to the active task/spec report or under an obvious project artifact folder. Include exact output paths in the final report.
 ## Resources

package/src/claude/skills/brainstorm/SKILL.md CHANGED

@@ -93,7 +93,7 @@ flowchart TD
     N -->|No| I
     N -->|Yes| O["Write Design Doc / Summary Report"]
     O --> P["Invoke /hapo:specs with report context"]
-    P --> Q["Optional /hapo:journal"]
+    P --> Q["Optional project notes update"]
 ```
 ## Tactical Execution Rules
@@ -156,7 +156,7 @@ Upon the user's explicit final approval of the sanitized design document:
 1. Generate the final **Design Doc / Summary Report**.
 2. Include: problem statement, exact requirements, evaluated approaches, recommended solution, risks, validation criteria, and next steps.
 3. Invoke `/hapo:specs` with the report context to hand off into CafeKit's structured specification phase.
-4. Optionally invoke `/hapo:journal` if the project context should be persisted for future developer memory.
+4. Optionally update an existing project notes, docs, or report file if the approved design context should be persisted for future work.
 ## Completion Bar