@haposoft/cafekit (npm) 0.7.29 → 0.8.0

Files changed (28)

package/README.md +21 -12
package/package.json +4 -1
package/src/claude/CLAUDE.md +81 -135
package/src/claude/agents/brainstormer.md +24 -13
package/src/claude/agents/code-auditor.md +1 -1
package/src/claude/agents/spec-maker.md +2 -2
package/src/claude/agents/test-runner.md +10 -8
package/src/claude/rules/ai-dev-rules.md +36 -51
package/src/claude/rules/hook-protocols.md +35 -0
package/src/claude/rules/orchestrator.md +11 -0
package/src/claude/rules/workflow.md +41 -45
package/src/claude/skills/brainstorm/SKILL.md +123 -39
package/src/claude/skills/chrome-devtools/scripts/package.json +3 -1
package/src/claude/skills/code-review/references/spec-compliance-review.md +1 -1
package/src/claude/skills/develop/SKILL.md +4 -4
package/src/claude/skills/develop/references/quality-gate.md +2 -2
package/src/claude/skills/git/SKILL.md +19 -2
package/src/claude/skills/git/references/finish-branch.md +61 -0
package/src/claude/skills/pdf/scripts/__pycache__/check_bounding_boxes.cpython-314.pyc +0 -0
package/src/claude/skills/specs/SKILL.md +15 -6
package/src/claude/skills/specs/references/review.md +1 -1
package/src/claude/skills/specs/rules/tasks-generation.md +3 -3
package/src/claude/skills/specs/templates/task.md +4 -2
package/src/claude/skills/sync/SKILL.md +2 -2
package/src/claude/skills/sync/references/sync-protocols.md +4 -4
package/src/claude/skills/test/SKILL.md +4 -1
package/src/claude/skills/test/references/execution-strategy.md +3 -1
package/src/claude/skills/test/references/test-memory.md +2 -2

package/src/claude/rules/hook-protocols.md ADDED

@@ -0,0 +1,35 @@
+# Hook Protocols
+Hooks are instruction boundaries. Do not bypass or work around them.
+## Privacy Block Hook
+When a tool call is blocked by the privacy hook, the output contains a JSON marker between:
+```text
+@@PRIVACY_PROMPT_START@@
+...
+@@PRIVACY_PROMPT_END@@
+```
+Required flow:
+1. Parse the JSON payload from the marker.
+2. Ask the user for approval using the available user-question tool and the prompt/options from the payload.
+3. If approved, retry only the blocked action.
+4. If denied, continue without reading that file or performing that blocked action.
+Never use another command, path, encoding, or side channel to access a privacy-blocked file without explicit approval.
+## State And Spec Hooks
+- If a hook reports state drift, run the appropriate sync/audit flow before continuing.
+- If a hook rejects task completion, keep the task `in_progress` or `blocked` until proof exists.
+- Do not mark a task `done` unless the matching task file contains a valid verification receipt.
+## Hook Failure Handling
+- Treat hook errors as blockers when they affect safety, privacy, or task state.
+- Record the blocker in the task/spec state when relevant.
+- If the hook output is malformed, stop and ask for clarification instead of guessing.
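
The Privacy Block Hook flow above (parse the payload, stop on malformed output) could be sketched roughly as follows. This is a minimal illustration, not CafeKit's implementation: the function name `extract_privacy_payload` is hypothetical, and the payload keys used in the usage note (`prompt`, `options`) are assumptions, since the diff only says the payload carries a prompt and options.

```python
import json

START = "@@PRIVACY_PROMPT_START@@"
END = "@@PRIVACY_PROMPT_END@@"


def extract_privacy_payload(output: str):
    """Return the JSON payload between the privacy markers.

    Returns None when the output carries no privacy marker at all.
    Raises ValueError when a marker is present but the payload is
    unterminated or not valid JSON, so the caller can stop and ask
    for clarification instead of guessing.
    """
    start = output.find(START)
    if start == -1:
        return None  # not a privacy block
    end = output.find(END, start + len(START))
    if end == -1:
        raise ValueError("privacy marker is not terminated")
    raw = output[start + len(START):end].strip()
    try:
        return json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"privacy payload is not valid JSON: {exc}") from exc
```

A caller would pass the blocked tool's output to `extract_privacy_payload`, show the (assumed) `prompt` and `options` fields to the user, and retry only the blocked action if the user approves.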

package/src/claude/rules/orchestrator.md CHANGED

@@ -40,6 +40,16 @@ Use when each step relies on the output of the previous one:
 Each agent must finish completely before the next one starts. Forward relevant outputs in the handoff prompt.
+### Implementation Review Chain
+For implementation tasks, keep the chain explicit:
+1. Implement against one active task/spec scope.
+2. Verify with `test-runner` using task files and exact evidence commands.
+3. Review with `code-auditor` using task files, design contracts, and the diff.
+Do not dispatch multiple implementation agents against the same files or same task. Parallelize only independent scopes with distinct file ownership.
 ### Parallel (independent tasks)
 Spawn concurrent agents when work does not overlap:
@@ -88,6 +98,7 @@ Agent lacks information to proceed. Supply the missing context and re-dispatch.
 ## Prompt Engineering for Subagents
 Subagents operate in a fresh context — they have **zero knowledge** of the parent session.
+They should receive task files, relevant design/requirements excerpts, file paths, acceptance criteria, and current diffs. Do not rely on inherited chat history.
 ### Prompt Template
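
The file-ownership rule added in the Implementation Review Chain hunk ("Parallelize only independent scopes with distinct file ownership") can be checked mechanically before dispatching agents. The sketch below assumes each scope is described by a set of file paths; the function name `ownership_conflicts` and the scope names in the usage note are hypothetical.

```python
from itertools import combinations


def ownership_conflicts(scopes: dict) -> list:
    """Return (scope_a, scope_b, shared_files) for every overlapping pair.

    `scopes` maps a scope name to the set of file paths it owns.
    An empty result means the scopes are safe to run in parallel
    under the distinct-file-ownership rule.
    """
    conflicts = []
    # Sort by scope name so the result order is deterministic.
    for (a, files_a), (b, files_b) in combinations(sorted(scopes.items()), 2):
        shared = files_a & files_b
        if shared:
            conflicts.append((a, b, shared))
    return conflicts
```

For example, two scopes that both list `src/b.py` would be reported as one conflict, which suggests running them sequentially rather than in parallel.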

package/src/claude/rules/workflow.md CHANGED

@@ -1,60 +1,56 @@
 # Execution Workflow
-> Token efficiency matters — deliver high quality without wasting context.
+Use the CafeKit loop: **Understand -> Plan -> Execute -> Verify -> Sync**.
-## Phase 0: Understand
+## 1. Understand
-Before planning, establish project awareness:
+- Read `./README.md` before feature planning or coding.
+- Read the active spec/task file when one exists.
+- Read and activate any CafeKit skill that likely applies before taking action.
+- Inspect only the code needed to understand the affected area.
+- Use `hapo:inspect` or focused search when structure is unclear.
-- Read `./docs/codebase-summary.md` if it exists and is fresh (< 2 days old)
-- Otherwise, generate one using `repomix` or delegate to `hapo:inspect` for scoped discovery
-- Scan `./docs/code-standards.md` and `./docs/system-architecture.md` for constraints
-- Identify which parts of the codebase are affected by the upcoming work
+## 2. Plan
-## Phase 1: Plan
+- For non-trivial features, use `/hapo:specs` to create or validate the spec.
+- For approved specs, work one task file at a time.
+- Extract from the active task:
+  - `Objective`
+  - `Constraints`
+  - `Related Files`
+  - `Completion Criteria`
+  - `Task Test Plan & Verification Evidence`
+- If these are missing or too vague to verify, route back to spec correction.
-- Delegate to `hapo:spec-maker` to draft an implementation plan with actionable TODO items in `./specs`
-- For complex features, spawn multiple `hapo:researcher` agents in parallel to investigate different technical areas, then feed findings back into the plan
-- Never start coding without a clear, reviewed plan
+## 3. Execute
-## Phase 2: Execute
+- Implement only the active scope.
+- Modify existing files directly; do not create duplicate "enhanced" variants.
+- Keep named contracts from `design.md` intact.
+- Do not use placeholder wiring, process-local stand-ins, or fake adapters as completion proof.
-### 2a. Implement
+## 4. Verify
-- Produce clean, maintainable code following the project's architectural patterns
-- Modify existing files — do not create "enhanced" duplicates
-- Cover edge cases and error paths
-- Run compile/build after every file change to catch issues immediately
+- Run exact commands from `Task Test Plan & Verification Evidence` first.
+- Then run repo-level lint/test/build as needed for confidence.
+- Use only fresh verification from the current run when claiming completion.
+- `PRECHECK_FAIL` outranks `NO_TESTS`.
+- `NO_TESTS` or `0 tests + exit 0` is not a pass when automated tests are required.
+- If verification fails, fix root cause and rerun. After 3 failed attempts, escalate with evidence.
-### 2b. Test
+## 5. Sync
-- Delegate to `hapo:test-runner` to validate the **final, production-ready code**
-- Expectations for test suites:
-  - Comprehensive unit coverage
-  - Error scenario testing
-  - Performance validation where applicable
-- Absolutely **no fake data, mocks-for-passing, or temporary workarounds** to make CI green
-- If tests fail: fix the root cause, re-run via `hapo:test-runner`, repeat until all pass — never end a session with red tests
-- If a test failure persists after **3 fix attempts**, stop and escalate to the user with a diagnostic summary
+- Mark task state only after implementation, tests/evidence, and review pass.
+- Write a verification receipt with commands run, outcomes, and artifact/runtime proof.
+- Keep `spec.json.task_registry` and markdown task files aligned.
+- Run docs checkpoint when a completed task affects public docs or architecture docs.
-### 2c. Review
+## Production Or CI Issues
-- Once tests are green, delegate to `hapo:code-auditor` for a code quality pass
-- Self-documenting code is the goal; add comments only for genuinely complex logic
-- Optimize for long-term maintainability and runtime performance
+1. Capture the failing signal.
+2. Diagnose root cause with logs/tests.
+3. Implement the smallest fix.
+4. Rerun the failing check plus relevant regression checks.
+5. Review before syncing or shipping.
-## Phase 3: Integrate & Verify
-- Follow the plan established by `hapo:planner` throughout integration
-- Honor existing API contracts and preserve backward compatibility
-- Document any breaking changes explicitly
-- Delegate to `hapo:docs-keeper` to keep `./docs` in sync with the implementation
-### Handling Production Issues
-When bugs surface in production or CI/CD:
-1. Delegate to `hapo:debugger` to analyze failures and produce a diagnostic report
-2. Implement the fix based on the report
-3. Delegate to `hapo:test-runner` to verify the fix
-4. If new test failures appear, resolve them and loop back to **Phase 2c (Review)**
+Do not patch symptoms before diagnosis unless the issue is a trivial syntax/type/lint failure with an obvious local cause.
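
The precedence rules in the new Verify step (`PRECHECK_FAIL` outranks `NO_TESTS`; zero tests with exit code 0 is not a pass when tests are required) can be read as a small decision function. This is a sketch under assumptions: the function names and parameters are hypothetical, and only the status labels `PRECHECK_FAIL`, `NO_TESTS`, `FAIL`, and `PASS` come from the diffed files.

```python
def classify_run(precheck_ok: bool, tests_executed: int, exit_code: int) -> str:
    """Map one verification run to a status label."""
    if not precheck_ok:
        # Compile/typecheck/build failure wins over everything else.
        return "PRECHECK_FAIL"
    if exit_code != 0:
        return "FAIL"
    if tests_executed == 0:
        # Exit code 0 with zero tests is reported as NO_TESTS, not PASS.
        return "NO_TESTS"
    return "PASS"


def counts_as_pass(status: str, tests_required: bool,
                   other_evidence_ok: bool = True) -> bool:
    """Decide whether a status may be used to claim completion."""
    if status == "PASS":
        return other_evidence_ok
    if status == "NO_TESTS":
        # Acceptable only when no test suite is required and all
        # other required evidence has passed.
        return (not tests_required) and other_evidence_ok
    return False
```

Keeping classification separate from the pass decision mirrors the wording in the diff: `NO_TESTS` is a reportable outcome, and whether it is acceptable depends on what the task requires.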

package/src/claude/skills/brainstorm/SKILL.md CHANGED

@@ -1,12 +1,51 @@
 ---
 name: hapo:brainstorm
-description: "Pragmatic Brainstorming: Socratic architectural challenges combined with step-by-step interactive design. Use for ideation, scope gating, and translating raw ideas into specs."
-version: 2.0.0
+description: "Scout-first brainstorming for unclear ideas, architectural choices, scope gates, and translating raw intent into a spec-ready design."
+argument-hint: "<idea_or_problem>"
+version: 2.1.0
 ---
 # Brainstorming Skill
-You execute the Brainstorm workflow. It is designed to aggressively interrogate assumptions, decompose features, and iteratively validate architectural solutions before any code is drafted.
+You execute CafeKit's pre-spec design workflow. Your job is to turn a raw idea into a validated, spec-ready design without writing code or starting implementation.
+`hapo:brainstorm` is the workflow entrypoint. The `brainstormer` agent is a specialist you may call for difficult architectural debate; it is not a replacement for this workflow.
+## Core Stance
+- Scout before asking. Do not ask generic questions when the repository can make the question concrete.
+- Make requirements exact before proposing architecture.
+- Challenge assumptions without turning the session into a debate for its own sake.
+- Prefer the simplest design that satisfies the agreed acceptance criteria.
+- Keep design approval incremental: small sections, explicit confirmation, then handoff.
+<HARD-GATE>
+Do NOT invoke implementation skills, write code, scaffold files, modify source, or begin `/hapo:develop` until the design has been presented and explicitly approved by the user.
+</HARD-GATE>
+<HARD-GATE-SCOUT-FIRST>
+Before asking clarifying questions or proposing approaches, run `hapo:inspect` or perform a narrow equivalent scout.
+Mandatory scout findings:
+1. Project type, primary languages, and major frameworks.
+2. Existing files/modules relevant to the topic.
+3. Current patterns or conventions for similar work.
+4. Relevant docs, specs, or plans already present.
+5. Constraints discovered from code, schemas, APIs, naming, or runtime setup.
+Then summarize the useful findings to the user in 3-6 bullets before Discovery.
+</HARD-GATE-SCOUT-FIRST>
+<HARD-GATE-EXACT-REQUIREMENTS>
+Before proposing solutions, capture concrete answers for:
+1. Expected output: feature behavior, artifact, UI surface, API shape, CLI behavior, or document.
+2. Acceptance criteria: observable checks that prove done.
+3. Scope boundary: what is explicitly out of scope for this round.
+4. Non-negotiable constraints: stack, files, compatibility, deadlines, naming, runtime.
+5. Touchpoints: existing files/modules/data/contracts the design will affect.
+If any item is vague, ask one more grounded question. Do not proceed with phrases like "make it better" or "add validation" unless you have concrete examples.
+</HARD-GATE-EXACT-REQUIREMENTS>
 ## Anti-Rationalization
@@ -22,63 +61,108 @@ You execute the Brainstorm workflow. It is designed to aggressively interrogate
 Leverage these specific tools or sub-agents to execute the workflow effectively:
 - `AskUserQuestion`: Use this to enforce the "One Question at a Time" rule and to present multiple choices.
-- `hapo:inspect`: Use this to discover codebase files and understand context.
+- `hapo:inspect`: Mandatory first pass for repo-aware brainstorming.
 - `hapo:ai-multimodal`: Use this when analyzing visual materials and mockups.
+- `hapo:generate-graph`: Use when a diagram would make architecture, flows, or trade-offs easier to validate.
 - `repomix --remote`: Use this bash command to summarize external Github repositories if a URL is provided.
 - `psql`: Query database schemas to understand existing data structures.
+- `brainstormer`: Call only for medium/high-complexity architecture trade-offs.
 - **Ecosystem Swarm (`SendMessage`):** Call `researcher` (validation), `docs-keeper` (architecture boundaries), or `project-manager` (scope warnings) for deeply complex specs.
-## The Hybrid Workflow
+## Authoritative Workflow
 ```mermaid
 flowchart TD
-    A["/hapo:inspect"] --> B[Assess Scope & Context]
-    B --> C{Multiple Subsystems?}
-    C -->|Yes| D[Force Sub-project Decomposition]
-    D --> B
-    C -->|No| E[Ask ONE Clarifying Question]
-    E --> F[Present 2-3 Solutions]
-    F --> R{Is Task Complex?}
-    R -->|Yes| S["Call researcher & docs-keeper via SendMessage"]
-    S --> G
-    R -->|No| G[Debate & Recommend Simplest Options]
-    G --> H[Present Design in Segments]
-    H --> I{User Approves Segment?}
-    I -->|No, Revise| H
-    I -->|Yes| J[Run 4-Point Spec Review]
-    J --> K{Final Approval?}
-    K -->|No| G
-    K -->|Yes| M[Write Design Doc / Report]
-    M --> L["/hapo:specs"]
-    L --> N["/hapo:journal"]
+    A["Run /hapo:inspect or narrow scout"] --> B["Summarize codebase findings"]
+    B --> C["Ask one grounded clarifying question"]
+    C --> D{"Exact requirements captured?"}
+    D -->|No| C
+    D -->|Yes| E{"Multiple independent subsystems?"}
+    E -->|Yes| F["Decompose into sub-projects"]
+    F --> C
+    E -->|No| G{"Medium/high complexity?"}
+    G -->|Yes| H["Call brainstormer/researcher/docs-keeper as needed"]
+    G -->|No| I["Propose 2-3 approaches"]
+    H --> I
+    I --> J["Recommend simplest viable option"]
+    J --> K["Present design in sections"]
+    K --> L{"User approves section?"}
+    L -->|No, revise| K
+    L -->|Yes| M["Run 4-point review"]
+    M --> N{"Final approval?"}
+    N -->|No| I
+    N -->|Yes| O["Write Design Doc / Summary Report"]
+    O --> P["Invoke /hapo:specs with report context"]
+    P --> Q["Optional /hapo:journal"]
 ```
 ## Tactical Execution Rules
-### 1. The Interrogation
-- Ask exactly **one question at a time**. Do not stack 5 bulleted questions in one response. Use `AskUserQuestion` to enforce this.
-- Structure your questions as multiple-choice evaluations where feasible.
-- Attack unexamined assumptions first. Ask "Why do you need this?" rather than just "How should we build it?"
+### 1. Scout Phase
+- Start with `hapo:inspect` for normal repositories. Use direct `Glob`/`Grep` only when the scope is tiny and obvious.
+- Do not scan the entire repository blindly. Target the user's topic.
+- If the user gives a GitHub URL, use `repomix --remote` before discussing architecture.
+- Report only useful findings, not raw file dumps.
-### 2. Trade-Off Analysis
+### 2. Discovery Phase
+- Ask exactly **one question at a time**. Do not stack multiple unrelated questions.
+- Prefer multiple-choice options grounded in scout findings.
+- Push vague intent into concrete examples, sample inputs/outputs, or acceptance criteria.
+- Challenge the first proposed solution when the underlying goal suggests a simpler path.
+### 3. Scope Guard
+- If the request spans 3+ independent subsystems, stop and decompose it.
+- Each sub-project should be able to move through brainstorm -> specs -> develop -> test -> review independently.
+- Do not design a monolithic spec when the work needs multiple specs.
+### 4. Trade-Off Analysis
 Whenever multiple approaches exist, compare them using specific dimensions:
-- Technical Debt & Maintenance Burden
-- Cognitive Complexity
-- User Experience (UX) and Developer Experience (DX)
-- Time vs. Value proportion
+- setup cost
+- runtime complexity
+- maintenance load
+- user experience and developer experience
+- compatibility and migration risk
+- time-to-value
+Always name the simplest viable option and explain the trade-off that makes it preferable.
-### 3. Visual & UI Protocols
-If the topic involves UI layouts, interactive elements, or visual styling: do not force text-only guesswork. Leverage the `hapo:ai-multimodal` skill to process mockups or structural blueprints when necessary. Prioritize visual alignment over abstract textual descriptions for frontend features.
+### 5. Visual & UI Protocols
-### 4. 4-Point Spec Review
+If the topic involves UI layouts, interactive elements, visual styling, architecture diagrams, or spatial flows:
+- Use `hapo:ai-multimodal` for supplied images, videos, PDFs, or mockups.
+- Use `hapo:generate-graph` when a diagram would make trade-offs clearer.
+- Do not force text-only guesswork for visual choices.
+- Visual aids support the brainstorm; they do not bypass user approval.
+### 6. Design Presentation
+- Present design in sections sized to complexity: Architecture, Data Flow, Interfaces, UX, Error Cases, Testing Strategy, Rollout.
+- Ask for approval after each meaningful section.
+- Keep changes tied to discovered touchpoints.
+- Do not invent unrelated refactors.
+### 7. 4-Point Spec Review
 Before passing the completed design to the user for final review, you must internally sanitize the drafted document:
 1. **Placeholder Scan:** Hunt and eliminate any "TBD", "TODO", or vague placeholder variables.
 2. **Consistency Check:** Ensure no contradictory flows exist between architecture and behavior segments.
 3. **Scope Check:** Verify the design addresses only the agreed feature bounds without uncontrolled scope creep.
 4. **Ambiguity Check:** Replace abstract claims ("we will implement logic here") with concrete instructions.
-### 5. Final Handoff & Documentation
+### 8. Final Handoff & Documentation
 Upon the user's explicit final approval of the sanitized design document:
 1. Generate the final **Design Doc / Summary Report**.
-2. Immediately invoke `/hapo:specs` to hand off the project into the implementation planning phase based on your report.
-3. Conclude by optionally invoking `/hapo:journal` if the project context should be persisted for future developer memory.
+2. Include: problem statement, exact requirements, evaluated approaches, recommended solution, risks, validation criteria, and next steps.
+3. Invoke `/hapo:specs` with the report context to hand off into CafeKit's structured specification phase.
+4. Optionally invoke `/hapo:journal` if the project context should be persisted for future developer memory.
+## Completion Bar
+You are done only when:
+- Scout findings were summarized.
+- The five exact requirement fields are concrete.
+- The selected design was approved by the user.
+- The 4-point review found no blocking gaps.
+- The summary/report is ready for `/hapo:specs`.
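
The "five exact requirement fields" named in the Completion Bar correspond to the items in the `HARD-GATE-EXACT-REQUIREMENTS` block. A minimal completeness check might look like the sketch below. The dictionary keys and the function name are hypothetical, and the check only detects empty fields; judging whether an answer is too vague still needs a human or model in the loop.

```python
# Hypothetical keys for the five fields listed in the hard gate.
REQUIRED_FIELDS = (
    "expected_output",
    "acceptance_criteria",
    "scope_boundary",
    "constraints",
    "touchpoints",
)


def unresolved_fields(requirements: dict) -> list:
    """Return the required fields that have no non-blank entries.

    `requirements` maps a field name to a list of captured answers.
    """
    missing = []
    for field in REQUIRED_FIELDS:
        answers = [a for a in requirements.get(field, []) if a.strip()]
        if not answers:
            missing.append(field)
    return missing
```

An empty result means every field has at least one answer, so the workflow could move on to proposing approaches; otherwise the next grounded question would target the first missing field.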

package/src/claude/skills/chrome-devtools/scripts/package.json CHANGED

@@ -3,7 +3,9 @@
   "version": "1.1.0",
   "description": "Browser automation scripts for Chrome DevTools Agent Skill",
   "type": "module",
-  "scripts": {},
+  "scripts": {
+    "test": "node --test __tests__/*.test.js"
+  },
   "dependencies": {
     "debug": "^4.4.0",
     "puppeteer": "^24.15.0",

package/src/claude/skills/code-review/references/spec-compliance-review.md CHANGED

@@ -24,7 +24,7 @@ Do not attempt a standard text-based review if the project includes Visual Specs
 3. If NO (Markdown Spec only): Read the spec directly and extract:
    - requirement bullets
    - task `Completion Criteria`
-   - task `Verification & Evidence`
+   - task `Task Test Plan & Verification Evidence` (or legacy `Verification & Evidence`)
    - canonical contracts/invariants from `design.md`
    Then verify the changed files against those concrete obligations.

package/src/claude/skills/develop/SKILL.md CHANGED

@@ -45,7 +45,7 @@ DO NOT write implementation code until an approved spec exists.
 <DEFINITION-OF-DONE>
 A task is NOT done because code compiles or a placeholder renders.
-A task is done only when the task file's Completion Criteria AND Verification & Evidence section are satisfied with real execution proof.
+A task is done only when the task file's Completion Criteria AND Task Test Plan & Verification Evidence section are satisfied with real execution proof. Existing specs may use legacy `Verification & Evidence`; treat that as the same contract.
 </DEFINITION-OF-DONE>
 <CONTRACT-FIDELITY>
@@ -85,7 +85,7 @@ flowchart TD
   - Objective + Constraints
   - Related Files
   - Completion Criteria
-  - Verification & Evidence
+  - Task Test Plan & Verification Evidence (or legacy Verification & Evidence)
   - Exact executable verification commands named in the task
   - Requirement IDs referenced by the task
   - Named technologies, frameworks, protocols, and data stores that the task/spec explicitly requires
@@ -118,7 +118,7 @@ The moment you finish coding, DO NOT proceed further. Switch to `references/qual
 **Mantra:** All feedback from code-auditor must be addressed thoroughly: Score >= 9.5 & Zero Critical issues.
 - Passing Step 4 requires ALL of the following:
-  1. Automated verification passes, including preflight compile/typecheck/build health and every exact command named in the task's `Verification & Evidence` section
+  1. Automated verification passes, including preflight compile/typecheck/build health and every exact command named in the task's `Task Test Plan & Verification Evidence` section (or legacy `Verification & Evidence`)
   2. Code review passes
   3. Task evidence passes (artifacts/runtime surfaces/negative-path checks from the task file are proven)
 - `PRECHECK_FAIL` outranks `NO_TESTS`. If compile/typecheck/build fails, the task is FAIL even when no test suite exists yet.
@@ -135,7 +135,7 @@ The moment you finish coding, DO NOT proceed further. Switch to `references/qual
   - `spec.json.task_registry[path].status = "done"`
   - `completed_at` + `last_updated_at`
   - synchronized top-level `updated_at`
-  - a human-readable verification receipt inside the task's `Verification & Evidence` section showing which commands ran, their outcomes, and what proof was observed
+  - a human-readable verification receipt inside the task's `Task Test Plan & Verification Evidence` section showing which commands ran, their outcomes, and what proof was observed
 - Verification receipts with `PRECHECK_FAIL`, `FAIL`, `UNVERIFIED`, or an explicit note that the implementation intentionally simplified a named contract MUST NOT be synchronized as `done`.
 - After syncing the active task, run a **Task Closeout Docs Checkpoint**
 - Task Closeout Docs Checkpoint:
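
The sync rules in the hunk above (set the registry status to `done`, stamp `completed_at`, `last_updated_at`, and the top-level `updated_at`, and refuse receipts marked `PRECHECK_FAIL`, `FAIL`, or `UNVERIFIED`) could be sketched as follows. Assumptions are labeled in the comments: the placement of the timestamp fields inside each registry entry, the ISO 8601 timestamp format, and the function name are not confirmed by the diff.

```python
from datetime import datetime, timezone
from typing import Optional

# Receipt statuses that must never be synchronized as done.
BLOCKING_STATUSES = {"PRECHECK_FAIL", "FAIL", "UNVERIFIED"}


def mark_task_done(spec: dict, path: str, receipt_status: str,
                   simplified_contract: bool = False,
                   now: Optional[datetime] = None) -> dict:
    """Mark one task as done in an in-memory copy of spec.json."""
    if receipt_status in BLOCKING_STATUSES:
        raise ValueError(f"receipt status {receipt_status} cannot be synced as done")
    if simplified_contract:
        raise ValueError("a named contract was intentionally simplified")
    # Assumption: timestamps are ISO 8601 strings in UTC.
    stamp = (now or datetime.now(timezone.utc)).isoformat()
    # Assumption: per-task timestamps live inside the registry entry.
    entry = spec["task_registry"][path]
    entry["status"] = "done"
    entry["completed_at"] = stamp
    entry["last_updated_at"] = stamp
    spec["updated_at"] = stamp  # keep the top-level timestamp in sync
    return spec
```

Raising before any mutation keeps the registry untouched when the receipt is not acceptable, which matches the rule that such receipts "MUST NOT be synchronized as `done`".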

package/src/claude/skills/develop/references/quality-gate.md CHANGED

@@ -9,7 +9,7 @@ Green tests are NOT enough. The gate requires three proofs:
 ## Automation Semantics
-- If the task names exact commands in `Verification & Evidence`, those exact commands are mandatory and must run before any fallback repo defaults.
+- If the task names exact commands in `Task Test Plan & Verification Evidence` (or legacy `Verification & Evidence`), those exact commands are mandatory and must run before any fallback repo defaults.
 - Preflight compile/typecheck/build health is mandatory. If compile/typecheck/build fails before tests are meaningful, the gate result is `PRECHECK_FAIL`, not `NO_TESTS`.
 - `NO_TESTS` is never an automatic PASS.
 - `NO_TESTS` is acceptable only when the task does **not** require a dedicated test suite command and every other required automated command/evidence item passes.
@@ -26,7 +26,7 @@ Variable: retry_count = 0
 Before START_LOOP:
   - Read the active task file(s)
-  - Extract Related Files, Completion Criteria, Verification & Evidence
+  - Extract Related Files, Completion Criteria, Task Test Plan & Verification Evidence (or legacy Verification & Evidence)
   - Extract the exact executable verification commands in declaration order
   - Extract relevant design contracts/invariants for the touched area
   - If any of these are missing or too vague to verify, FAIL immediately and route back to spec correction
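
Several hunks in this release rename the task section to `Task Test Plan & Verification Evidence` while still accepting the legacy `Verification & Evidence` title. One way to resolve the section with that fallback is sketched below. It assumes task files mark sections with Markdown `#` headings, which the diff does not state, and the function names are hypothetical. The parser is deliberately naive: a `#` line inside a fenced code block would be treated as a heading.

```python
import re

CURRENT_TITLE = "Task Test Plan & Verification Evidence"
LEGACY_TITLE = "Verification & Evidence"
HEADING = re.compile(r"^#{1,6}\s+(.*\S)\s*$")


def find_sections(markdown: str) -> dict:
    """Map each heading title to the text that follows it."""
    sections = {}
    title, body = None, []
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            if title is not None:
                sections[title] = "\n".join(body).strip()
            title, body = match.group(1), []
        elif title is not None:
            body.append(line)
    if title is not None:
        sections[title] = "\n".join(body).strip()
    return sections


def verification_section(markdown: str):
    """Return the verification section, preferring the current title."""
    sections = find_sections(markdown)
    for name in (CURRENT_TITLE, LEGACY_TITLE):
        if name in sections:
            return sections[name]
    return None  # caller should route back to spec correction
```

Checking the current title first means a task file that somehow contains both sections is read through the new contract, while pre-existing files with only the legacy title keep working.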

package/src/claude/skills/git/SKILL.md CHANGED

@@ -1,7 +1,7 @@
 ---
 name: hapo:git
-description: "Hapo Native Git Operations & Worktree Management. Handles safe commits, conventional split, secret scanning, and sibling-branch worktrees locally."
-argument-hint: "commit | push | pr | worktree <feature-desc>"
+description: "Hapo Native Git Operations & Worktree Management. Handles safe commits, conventional split, branch finish choices, secret scanning, and sibling-branch worktrees locally."
+argument-hint: "commit | push | pr | finish | worktree <feature-desc>"
 version: "1.0.0"
 ---
@@ -17,6 +17,7 @@ This skill merges Version Control Systems (VSC) capabilities and parallel Worktr
 - **`commit`**: Scan secrets, analyze diff, auto-split chunks into conventional commits.
 - **`push`**: Securely push to the current branch.
 - **`pr`**: Propose merging current feature into `develop` or `main`.
+- **`finish`**: Verify current branch/worktree, then present explicit finish options.
 ### 2. Worktree Engine (Safe Parallel Development)
 - **`worktree <feature-description>`**: Creates a sibling directory alongside the current repo to isolate dependencies and database contexts without dirtying the main worktree.
@@ -38,7 +39,23 @@ This skill merges Version Control Systems (VSC) capabilities and parallel Worktr
 - Never run branching logic within the same root if the task requires heavy, isolated setup.
 - Worktrees MUST be created as *sibling directories*: `../<project-name>-<feature-branch>`.
+### 4. Finish Branch Protocol
+- Before merge, push, PR, or discard, run a fresh status and verification check:
+  ```bash
+  git status --short
+  git branch --show-current
+  ```
+- If uncommitted changes exist, route through `commit` or ask whether to keep them unstaged.
+- Present explicit finish options:
+  1. Merge locally into the target branch.
+  2. Push current branch and open/update a PR.
+  3. Keep the branch/worktree for later.
+  4. Discard branch/worktree only after typed confirmation.
+- Never force-push or delete worktrees without explicit user confirmation.
+- If running inside a git worktree, detect it with `git worktree list` and clean up only the worktree created for this task.
 ## References & Protocols
 - `references/commit-protocols.md`: Strict rules for analyzing Git Diffs and determining commit splitting behavior.
+- `references/finish-branch.md`: Finish branch / PR / worktree closeout protocol.
 - `references/worktree-blueprint.md`: The 4-step Bash process to accurately construct an out-of-bounds Git Worktree Environment.

package/src/claude/skills/git/references/finish-branch.md ADDED

@@ -0,0 +1,61 @@
+# Finish Branch Protocol
+Use this protocol for `hapo:git finish`.
+## Goal
+Finish the current branch deliberately. Do not assume the user wants merge, push, PR, keep, or discard.
+## Step 1: Inspect Current State
+Run:
+```bash
+git rev-parse --show-toplevel
+git branch --show-current
+git status --short
+git worktree list
+```
+Identify:
+- current branch
+- whether this is a linked worktree
+- uncommitted changes
+- upstream tracking branch
+- likely target branch (`main`, `develop`, or user-provided target)
+## Step 2: Verify Before Finish
+Before any merge/push/PR/discard recommendation, run the task-required verification or the repository default checks. If verification is not available, report `NO_TESTS` / `UNVERIFIED` honestly.
+Never claim a branch is ready from stale output.
+## Step 3: Present Finish Options
+Offer explicit options:
+1. **Merge locally** — merge current branch into target branch.
+2. **Push / PR** — push current branch and create or update a pull request.
+3. **Keep** — leave branch/worktree as-is for later work.
+4. **Discard** — delete branch/worktree only after explicit typed confirmation.
+Do not select an option silently unless the user already requested it.
+## Step 4: Safety Rules
+- Never force-push by default.
+- Never delete a branch with unmerged commits without typed confirmation.
+- Never remove a worktree you did not create or cannot identify.
+- Never discard uncommitted changes without explicit typed confirmation.
+- If verification fails, recommend fix/retest before merge or PR.
+## Step 5: Completion Report
+Report:
+- branch and target
+- chosen finish action
+- verification commands and result
+- commit/push/PR result
+- remaining risks or unresolved questions
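
Step 1 of the protocol asks whether the current checkout is a linked worktree and whether uncommitted changes exist. The sketch below interprets already-captured command output rather than running git itself. Two assumptions: it expects `git worktree list --porcelain` output (the protocol only names `git worktree list`; `--porcelain` is a standard git option that emits `worktree <path>` records with the main worktree first), and it compares paths as plain strings against `git rev-parse --show-toplevel`. The function names are hypothetical.

```python
def parse_worktrees(porcelain: str) -> list:
    """Return worktree paths in listing order (main worktree first)."""
    prefix = "worktree "
    return [line[len(prefix):] for line in porcelain.splitlines()
            if line.startswith(prefix)]


def is_linked_worktree(porcelain: str, toplevel: str) -> bool:
    """True when `toplevel` is listed after the main worktree."""
    paths = parse_worktrees(porcelain)
    if not paths:
        raise ValueError("no worktree entries found in output")
    return toplevel in paths[1:]


def has_uncommitted_changes(status_short: str) -> bool:
    """True when `git status --short` printed at least one entry."""
    return any(line.strip() for line in status_short.splitlines())
```

With these answers in hand, the finish options can be presented with the right caveats, for example offering worktree cleanup only when `is_linked_worktree` is true.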

package/src/claude/skills/pdf/scripts/__pycache__/check_bounding_boxes.cpython-314.pyc ADDED

Binary file

package/src/claude/skills/specs/SKILL.md CHANGED

@@ -85,7 +85,8 @@ Display selection menu via `AskUserQuestion`:
 ### When called WITH a feature description
 System auto-analyzes the description:
-- If description is too short (< 20 words) or vague → stop and ask 1-2 clarifying questions
+- If description is too short (< 20 words) or missing one concrete detail → stop and ask 1-2 clarifying questions
+- If the idea has unresolved architecture choices, unclear acceptance criteria, unclear scope boundaries, or multiple plausible approaches → stop and route to `/hapo:brainstorm <idea>` before creating spec artifacts
 - If task is simple (small bugfix, config change) → suggest "A spec may not be needed for this. Continue anyway?"
 - If task is complex (multi-module, security/migration related) → auto-activate deep research, ask user 3 scope questions
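
The routing bullets above can be summarized as a small decision function. This is a sketch under assumptions: the word count uses whitespace splitting, the boolean flags stand in for judgments the skill makes by reading the description, and the function name and return labels are hypothetical. The brainstorm check runs first because the updated flowchart in this same file places "Needs pre-spec brainstorm?" before "Clear enough?".

```python
def route_description(description: str, *,
                      unresolved_architecture: bool = False,
                      unclear_acceptance: bool = False,
                      unclear_scope: bool = False,
                      multiple_approaches: bool = False,
                      missing_concrete_detail: bool = False) -> str:
    """Pick the next step for a feature description passed to the skill."""
    if (unresolved_architecture or unclear_acceptance
            or unclear_scope or multiple_approaches):
        return "brainstorm"  # route to /hapo:brainstorm before any spec files
    if len(description.split()) < 20 or missing_concrete_detail:
        return "clarify"  # stop and ask 1-2 clarifying questions
    return "analyze"  # continue with the spec workflow
```

A short description with no open design questions therefore lands on "clarify", while any unresolved architecture or scope question sends the idea to brainstorming regardless of its length.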
@@ -111,7 +112,9 @@ flowchart TD
     A["Call /hapo:specs"] --> B{Has description?}
     B -->|No| C["Menu: init / status / resume / --validate / archive"]
     B -->|Yes| D["Step 1: Analyze description"]
-    D --> E{Clear enough?}
+    D --> DB{"Needs pre-spec brainstorm?"}
+    DB -->|Yes| DB2["Stop: run /hapo:brainstorm with same idea"]
+    DB -->|No| E{Clear enough?}
     E -->|No| F["Ask user 1-2 clarifying questions"]
     F --> D
     E -->|Yes| G["Step 2: Scan specs/ for related specs"]
@@ -148,6 +151,12 @@ flowchart TD
 ### Step 1: Analyze Description
 - Assess clarity and complexity of the description
+- Route to `hapo:brainstorm` before creating files when:
+  - the expected output or acceptance criteria are not concrete
+  - the scope boundary is unknown
+  - the request has 2-3 viable architectures and no user-approved direction
+  - the feature spans 3+ independent subsystems and needs decomposition
+  - the user is explicitly asking to explore, compare, debate, or decide
 - **Multimodal & Document Auto-Ingestion (MANDATORY)**: If the input includes file paths or URLs pointing to images, audio, video, or Office documents, you MUST spawn the matching subagent to extract content BEFORE proceeding:
   - `.mp3`, `.wav`, `.mp4`, `.mov`, `.jpg`, `.png`, `.webp` → `Task(subagent_type="hapo:ai-multimodal", prompt="Transcribe/Analyze [path]")`
   - `.pdf` → `Task(subagent_type="hapo:pdf", prompt="Extract text and tables from [path]")`
@@ -218,7 +227,7 @@ Load: `references/scope-inquiry.md`
 - Load `rules/tasks-generation.md` for core principles
 - Load `rules/tasks-parallel-analysis.md` for parallel markers (default: enabled)
 - Each task file follows template `templates/task.md`
-- Each task file MUST include `Completion Criteria` and `Verification & Evidence` sections detailed enough that a downstream quality gate can prove the task is truly done.
+- Each task file MUST include `Completion Criteria` and `Task Test Plan & Verification Evidence` sections detailed enough that a downstream quality gate can prove the task is truly done.
 - Build `spec.json.task_registry` alongside `task_files`. For each task file, register at minimum:
   - `id`
   - `title`
@@ -311,8 +320,8 @@ Load: `references/review.md` + `rules/design-review.md`
 - FAIL if any task file exists on disk but is missing from `task_registry`
 - FAIL if any path in `task_registry` does not exist on disk
 - FAIL if any requirement or NFR mapping uses non-numeric labels (`NFR-1`, `SEC-1`, etc.)
-- FAIL if a task lacks `Completion Criteria` or `Verification & Evidence`
-- FAIL if accepted validation decisions exist in reports but are not reflected in the implementation-facing sections of affected artifacts (`Objective`, `Constraints`, `Implementation Steps`, `Completion Criteria`, `Verification & Evidence`, canonical contracts, or requirements text).
+- FAIL if a task lacks `Completion Criteria` or `Task Test Plan & Verification Evidence` (legacy `Verification & Evidence` is accepted only for pre-existing task files)
+- FAIL if accepted validation decisions exist in reports but are not reflected in the implementation-facing sections of affected artifacts (`Objective`, `Constraints`, `Implementation Steps`, `Completion Criteria`, `Task Test Plan & Verification Evidence`, canonical contracts, or requirements text).
 - FAIL if the spec scope/provider was switched away from Anthropic/Claude but `requirements.md`, `design.md`, or `tasks/*.md` still contain stale provider-specific strings such as `Claude API`, `Haiku`, or `haiku_reachable`. `research.md` is the only allowed place for historical cost comparisons.
 - FAIL if privacy/delete-data work lacks a single canonical deletion policy. The design MUST explicitly choose either:
   1. hard-delete with no re-registration lock, or
@@ -446,7 +455,7 @@ Before finalizing any specification, assert all the following:
 - [ ] **Requirements traceability** matrix present in design.md
 - [ ] **Canonical Contracts & Invariants** filled for auth/transport/persistence/artifact-sensitive work
 - [ ] **Every task file** maps to at least 1 valid in-scope requirement ID
-- [ ] **Every task file** includes `Verification & Evidence` with executable or inspectable proof
+- [ ] **Every task file** includes `Task Test Plan & Verification Evidence` with executable or inspectable proof
 - [ ] **State Machine Blueprint:** design.md contains Mermaid diagrams for non-trivial flows
 - [ ] **Dependency graph complete**: no task can start before its blockers are listed
 - [ ] **Risk matrix filled**: likelihood × impact, with mitigation for High items
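
The registry and section checks touched by this diff can be sketched as a small validator. This is a minimal illustration, not the package's actual implementation. It assumes each `task_registry` entry exposes a path (a hypothetical field, since the diff truncates the field list), that required sections appear as Markdown headings, and that the caller supplies the set of pre-existing task files allowed to keep the legacy heading. The names `check_tasks`, `has_heading`, and `legacy_paths` are made up for the example.

```python
import re

REQUIRED = "Completion Criteria"
EVIDENCE = "Task Test Plan & Verification Evidence"
LEGACY_EVIDENCE = "Verification & Evidence"  # accepted only for pre-existing files


def has_heading(text: str, title: str) -> bool:
    """True if some Markdown heading line consists of exactly this title."""
    pattern = rf"^#+\s*{re.escape(title)}\s*$"
    return re.search(pattern, text, flags=re.MULTILINE) is not None


def check_tasks(registry_paths, files_on_disk, legacy_paths=frozenset()):
    """Return a list of FAIL messages; an empty list means the checks pass.

    registry_paths: paths listed in task_registry (hypothetical `path` field)
    files_on_disk:  mapping of task file path -> file content
    legacy_paths:   pre-existing task files allowed to keep the old heading
    """
    failures = []
    registered = set(registry_paths)
    on_disk = set(files_on_disk)

    # Registry and disk must agree in both directions.
    for path in sorted(on_disk - registered):
        failures.append(f"FAIL: {path} exists on disk but is missing from task_registry")
    for path in sorted(registered - on_disk):
        failures.append(f"FAIL: {path} is in task_registry but does not exist on disk")

    # Every task file needs both required sections.
    for path, text in sorted(files_on_disk.items()):
        if not has_heading(text, REQUIRED):
            failures.append(f"FAIL: {path} lacks `{REQUIRED}`")
        evidence_ok = has_heading(text, EVIDENCE) or (
            path in legacy_paths and has_heading(text, LEGACY_EVIDENCE)
        )
        if not evidence_ok:
            failures.append(f"FAIL: {path} lacks `{EVIDENCE}`")
    return failures
```

Passing file contents as a mapping keeps the sketch independent of the filesystem. A real gate would presumably read `spec.json` and the `tasks/` directory before calling something like this.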

package/src/claude/skills/specs/references/review.md CHANGED Viewed

@@ -42,7 +42,7 @@ These rules override any self-reasoning or optimization the system may attempt:
 4. **Apply YAGNI to fixes.** When user says "configure later" or "decide later", add a single note to the task file. Do NOT generate multiple concrete implementations (e.g., 4 provider files when user only asked for abstraction).
 5. **No false completion.** You MUST NOT set `validation.status = "completed"` or `ready_for_implementation = true` until a reconciliation audit proves the accepted findings and validation decisions are reflected in the physical spec artifacts.
 6. **Provider drift is a real defect.** If the scope changed away from Claude/Anthropic, stale strings like `Claude API`, `Haiku`, or `haiku_reachable` in `requirements.md`, `design.md`, or `tasks/*.md` are validation failures. `research.md` may mention them only as historical comparison.
-7. **Implementation-facing propagation is mandatory.** A decision that affects implementation is NOT considered applied if it only appears in `Risk Assessment`, `validate-log.md`, or `red-team-report.md`. It must update at least one of: `requirements.md`, `Canonical Contracts & Invariants`, `Objective`, `Constraints`, `Implementation Steps`, `Completion Criteria`, or `Verification & Evidence`.
+7. **Implementation-facing propagation is mandatory.** A decision that affects implementation is NOT considered applied if it only appears in `Risk Assessment`, `validate-log.md`, or `red-team-report.md`. It must update at least one of: `requirements.md`, `Canonical Contracts & Invariants`, `Objective`, `Constraints`, `Implementation Steps`, `Completion Criteria`, or `Task Test Plan & Verification Evidence`.
 ---
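
Rule 6 above (provider drift) lends itself to a simple string scan. The sketch below is illustrative only and assumes the spec scope has already been switched away from Claude/Anthropic. The function name `find_provider_drift` and the path-to-content mapping are hypothetical. The stale strings and the exempt `research.md` come from the rule text.

```python
from fnmatch import fnmatchcase

STALE_STRINGS = ("Claude API", "Haiku", "haiku_reachable")
CHECKED_PATTERNS = ("requirements.md", "design.md", "tasks/*.md")


def find_provider_drift(artifacts):
    """Return sorted (path, stale_string) pairs found in checked artifacts.

    artifacts: mapping of spec-relative path -> file content.
    Files that match no checked pattern (such as research.md) are skipped,
    so historical comparisons there are not reported.
    """
    hits = []
    for path, text in artifacts.items():
        if not any(fnmatchcase(path, pattern) for pattern in CHECKED_PATTERNS):
            continue
        for needle in STALE_STRINGS:
            if needle in text:  # case-sensitive substring match
                hits.append((path, needle))
    return sorted(hits)
```

Any non-empty result would count as a validation failure under this reading of the rule. The match is case-sensitive, so `haiku_reachable` is reported on its own and does not also trigger the `Haiku` entry.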