@haposoft/cafekit 0.8.0 → 0.8.3

Files changed (42)
  1. package/README.md +2 -0
  2. package/package.json +6 -3
  3. package/src/claude/CLAUDE.md +1 -0
  4. package/src/claude/agents/debugger.md +58 -4
  5. package/src/claude/agents/docs-keeper.md +1 -1
  6. package/src/claude/agents/god-developer.md +2 -2
  7. package/src/claude/agents/project-manager.md +1 -1
  8. package/src/claude/agents/spec-maker.md +22 -19
  9. package/src/claude/agents/test-runner.md +1 -0
  10. package/src/claude/agents/ui-ux-designer.md +3 -3
  11. package/src/claude/migration-manifest.json +1 -0
  12. package/src/claude/references/debugger/condition-based-waiting.md +56 -0
  13. package/src/claude/references/debugger/frontend-verification.md +59 -0
  14. package/src/claude/references/debugger/performance-diagnostics.md +76 -0
  15. package/src/claude/references/debugger/side-effect-gate.md +48 -0
  16. package/src/claude/rules/manage-docs.md +2 -2
  17. package/src/claude/settings/settings.json +1 -1
  18. package/src/claude/skills/ai-multimodal/SKILL.md +1 -1
  19. package/src/claude/skills/brainstorm/SKILL.md +2 -2
  20. package/src/claude/skills/chrome-devtools/SKILL.md +1 -1
  21. package/src/claude/skills/code-review/SKILL.md +1 -1
  22. package/src/claude/skills/debug/SKILL.md +216 -0
  23. package/src/claude/skills/develop/SKILL.md +1 -1
  24. package/src/claude/skills/develop/references/quality-gate.md +3 -3
  25. package/src/claude/skills/develop/references/subagent-patterns.md +10 -10
  26. package/src/claude/skills/frontend-design/SKILL.md +1 -1
  27. package/src/claude/skills/hotfix/SKILL.md +30 -10
  28. package/src/claude/skills/hotfix/references/diagnosis-protocol.md +28 -4
  29. package/src/claude/skills/hotfix/references/parallel-patterns.md +13 -13
  30. package/src/claude/skills/hotfix/references/prevention-gate.md +8 -1
  31. package/src/claude/skills/hotfix/references/workflow-specialized.md +3 -1
  32. package/src/claude/skills/inspect/SKILL.md +2 -2
  33. package/src/claude/skills/inspect/references/external-gemini-inspection.md +11 -11
  34. package/src/claude/skills/research/SKILL.md +1 -1
  35. package/src/claude/skills/specs/SKILL.md +29 -16
  36. package/src/claude/skills/specs/references/codebase-analysis.md +34 -3
  37. package/src/claude/skills/specs/references/research-strategy.md +54 -7
  38. package/src/claude/skills/specs/templates/research.md +46 -0
  39. package/src/claude/skills/test/SKILL.md +1 -1
  40. package/src/claude/skills/ai-multimodal/scripts/.coverage +0 -0
  41. package/src/claude/skills/ai-multimodal/scripts/tests/.coverage +0 -0
  42. package/src/claude/skills/pdf/scripts/__pycache__/check_bounding_boxes.cpython-314.pyc +0 -0
@@ -11,7 +11,7 @@ SCALE ≥ 6 → Use internal discovery instead
11
11
 
12
12
  ## Configuration
13
13
 
14
- Read from `packages/spec/src/claude/runtime.json`:
14
+ Read from `.claude/runtime.json`:
15
15
  ```json
16
16
  {
17
17
  "gemini": {
@@ -62,17 +62,17 @@ If not installed, ask user:
62
62
  1. **Yes** - Provide installation instructions (may need manual auth steps)
63
63
  2. **No** - Fall back to internal discovery (`internal-inspection.md`)
64
64
 
65
- ## Spawning Parallel Bash Agents
65
+ ## Running Parallel Gemini Commands
66
66
 
67
- Use `Agent` tool with `subagent_type: "Bash"` to spawn parallel agents:
67
+ Use the `Bash` tool for each scoped Gemini command. Run them in the same tool turn when the runtime supports parallel tool calls; otherwise run them sequentially. Do not use `Agent` with `subagent_type: "Bash"`; `Bash` is a tool, not a subagent.
68
68
 
69
69
  ```
70
- Agent 1: subagent_type="Bash", prompt="Run: gemini -y -m gemini-3-flash-preview '[prompt1]'"
71
- Agent 2: subagent_type="Bash", prompt="Run: gemini -y -m gemini-3-flash-preview '[prompt2]'"
72
- Agent 3: subagent_type="Bash", prompt="Run: gemini -y -m gemini-3-flash-preview '[prompt3]'"
70
+ Bash: gemini -y -m gemini-3-flash-preview '[prompt1]'
71
+ Bash: gemini -y -m gemini-3-flash-preview '[prompt2]'
72
+ Bash: gemini -y -m gemini-3-flash-preview '[prompt3]'
73
73
  ```
74
74
 
75
- Spawn all in single message for parallel execution.
75
+ Group by independent scopes so each command can return a focused report.
76
76
 
77
77
  ## Prompt Guidelines
78
78
 
@@ -107,11 +107,11 @@ Do not expand beyond the provided scope.
107
107
 
108
108
  User: "Find database migration files" with `ext`
109
109
 
110
- Spawn 3 parallel Bash agents via Agent tool:
110
+ Run 3 scoped Gemini CLI commands:
111
111
  ```
112
- Agent 1 (Bash): "Run: gemini -y -m gemini-3-flash-preview 'Search db/, migrations/ for migration files'"
113
- Agent 2 (Bash): "Run: gemini -y -m gemini-3-flash-preview 'Search lib/, src/ for database schema files'"
114
- Agent 3 (Bash): "Run: gemini -y -m gemini-3-flash-preview 'Search config/ for database configuration'"
112
+ Bash: gemini -y -m gemini-3-flash-preview 'Search db/, migrations/ for migration files'
113
+ Bash: gemini -y -m gemini-3-flash-preview 'Search lib/, src/ for database schema files'
114
+ Bash: gemini -y -m gemini-3-flash-preview 'Search config/ for database configuration'
115
115
  ```
116
116
 
117
117
  ## Reading File Content
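
The change above replaces per-command subagents with direct CLI calls, run in parallel when possible and sequentially otherwise. A minimal Python sketch of that shape, for use outside the agent runtime, might look like the following. The `gemini -y -m gemini-3-flash-preview` flags are copied from the commands in the diff; `build_command`, `run_cli`, and `run_scoped` are hypothetical helper names, not part of the package.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence

MODEL = "gemini-3-flash-preview"  # model name as it appears in the diff


def build_command(prompt: str) -> List[str]:
    """Build the argv for one scoped Gemini CLI call (flags copied from the diff)."""
    return ["gemini", "-y", "-m", MODEL, prompt]


def run_cli(argv: Sequence[str]) -> str:
    """Run one command and return its stdout; raises if the CLI exits non-zero."""
    done = subprocess.run(list(argv), capture_output=True, text=True, check=True)
    return done.stdout


def run_scoped(prompts: Sequence[str], parallel: bool = True,
               runner: Callable[[Sequence[str]], str] = run_cli) -> List[str]:
    """Run one command per scope and return the reports in prompt order."""
    commands = [build_command(p) for p in prompts]
    if not parallel or len(commands) < 2:
        return [runner(c) for c in commands]  # sequential fallback
    with ThreadPoolExecutor(max_workers=len(commands)) as pool:
        return list(pool.map(runner, commands))  # map preserves input order
```

Passing a custom `runner` keeps the sketch testable without the Gemini CLI installed; each prompt should cover one independent scope so the reports stay focused.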
@@ -19,7 +19,7 @@ Before delegating, briefly assess the `[topic]`.
19
19
  - If solid, proceed.
20
20
 
21
21
  ### Phase 2: Agent Delegation
22
- Call the `TaskCreate` tool to spin up the `researcher` subagent.
22
+ Call the `Agent` tool to invoke the `researcher` subagent. Use `TaskCreate` only for task-list tracking when the workflow needs persistent task state.
23
23
  **Instructions to pass to Researcher:**
24
24
  ```text
25
25
  Conduct comprehensive research on: [topic]
@@ -11,10 +11,10 @@ argument-hint: "<feature-description> | status | resume | --validate | archive"
11
11
 
12
12
  ## Overview
13
13
 
14
- This skill provides a 10-step workflow to transform ideas into specs:
14
+ This skill provides a 10-step workflow to transform ideas into evidence-backed specs:
15
15
 
16
16
  ```
17
- Analyze → Dependency Scan → Complexity Assessment → Init → Requirements → Design → Tasks → Hydration → Review → Completion
17
+ Analyze → Dependency Scan → Complexity Assessment → Init → Evidence Gate + Requirements → Design → Tasks → Hydration → Review → Completion
18
18
  ```
19
19
 
20
20
  **CRITICAL:** Before starting, the system MUST:
@@ -46,6 +46,7 @@ Analyze → Dependency Scan → Complexity Assessment → Init → Requirements
46
46
  - `task_files` in `spec.json` MUST exactly match the real files under `tasks/` after Step 7.
47
47
  - `task_registry` in `spec.json` MUST exist once task files are generated and MUST contain one entry per task file, keyed by relative path.
48
48
  - `ready_for_implementation` is a hard gate, not a convenience flag. Never set it before the finalization audit passes.
49
+ - Non-trivial specs MUST have an evidence trail in `research.md` before requirements, design, or tasks are finalized. Evidence can be codebase scout findings, external/current research, or an explicit skip rationale.
49
50
 
50
51
  ### Output Criteria
51
52
  - Never implement code — only create spec documents
@@ -89,6 +90,7 @@ System auto-analyzes the description:
89
90
  - If the idea has unresolved architecture choices, unclear acceptance criteria, unclear scope boundaries, or multiple plausible approaches → stop and route to `/hapo:brainstorm <idea>` before creating spec artifacts
90
91
  - If task is simple (small bugfix, config change) → suggest "A spec may not be needed for this. Continue anyway?"
91
92
  - If task is complex (multi-module, security/migration related) → auto-activate deep research, ask user 3 scope questions
93
+ - For non-trivial specs, execute the Step 5 Evidence Gate before writing final requirements. Do not design from memory when codebase or current external evidence can answer the question.
92
94
 
93
95
  ### When called WITH `--validate` argument
94
96
 
@@ -128,11 +130,11 @@ flowchart TD
128
130
  H3 -->|No| K["Keep default scope"]
129
131
  J --> L["Step 4: Init — create specs/<feature>/"]
130
132
  K --> L
131
- L --> M["Step 5: Requirements - write EARS"]
132
- M --> N{Need deep research?}
133
- N -->|Yes| O["Research: researchers + inspector + docs"]
134
- O --> P["Write research.md"]
135
- N -->|No| P
133
+ L --> M["Step 5A: Evidence Gate - scout + research"]
134
+ M --> N{Evidence sufficient?}
135
+ N -->|No| O["Ask user / run targeted scout / external research"]
136
+ O --> M
137
+ N -->|Yes| P["Step 5B: Requirements — write EARS"]
136
138
  P --> Q["Step 6: Design — pick discovery mode"]
137
139
  Q --> R["Write design.md"]
138
140
  R --> S["Step 7: Tasks — split into individual files"]
@@ -157,12 +159,12 @@ flowchart TD
157
159
  - the request has 2-3 viable architectures and no user-approved direction
158
160
  - the feature spans 3+ independent subsystems and needs decomposition
159
161
  - the user is explicitly asking to explore, compare, debate, or decide
160
- - **Multimodal & Document Auto-Ingestion (MANDATORY)**: If the input includes file paths or URLs pointing to images, audio, video, or Office documents, you MUST spawn the matching subagent to extract content BEFORE proceeding:
161
- - `.mp3`, `.wav`, `.mp4`, `.mov`, `.jpg`, `.png`, `.webp` → `Task(subagent_type="hapo:ai-multimodal", prompt="Transcribe/Analyze [path]")`
162
- - `.pdf` → `Task(subagent_type="hapo:pdf", prompt="Extract text and tables from [path]")`
163
- - `.docx` → `Task(subagent_type="hapo:docx", prompt="Extract content from [path]")`
164
- - `.pptx` → `Task(subagent_type="hapo:pptx", prompt="Extract slide content from [path]")`
165
- - `.xlsx`, `.csv` → `Task(subagent_type="hapo:xlsx", prompt="Extract data from [path]")`
162
+ - **Multimodal & Document Auto-Ingestion (MANDATORY)**: If the input includes file paths or URLs pointing to images, audio, video, or Office documents, activate the matching skill workflow to extract content BEFORE proceeding:
163
+ - `.mp3`, `.wav`, `.mp4`, `.mov`, `.jpg`, `.png`, `.webp` → use `hapo:ai-multimodal` to transcribe/analyze the file
164
+ - `.pdf` → use `hapo:pdf` to extract text and tables
165
+ - `.docx` → use `hapo:docx` to extract document content
166
+ - `.pptx` → use `hapo:pptx` to extract slide content
167
+ - `.xlsx`, `.csv` → use `hapo:xlsx` to extract data
166
168
  - *Append the extracted findings into your working memory as the enriched "description".*
167
169
  - If description < 20 words or lacks concrete nouns → ask 1-2 clarifying questions
168
170
  - If task is too simple → warn user that a spec may not be needed
@@ -194,12 +196,19 @@ Load: `references/scope-inquiry.md`
194
196
  - `expansion_policy`: `requires-user-approval`
195
197
  - Do NOT generate requirements, design, or tasks at this step
196
198
 
197
- ### Step 5: Requirements & Research
199
+ ### Step 5: Evidence Gate, Requirements & Research
198
200
  - Read `spec.json` — stop if init hasn't completed
199
201
  - Stop if requirements already exist, unless user wants to regenerate
200
202
  - Respect `scope_lock` — keep new requirements within `in_scope`
201
- - Analyze existing codebase if this is an enhancement (not greenfield)
202
- - **MANDATORY Research:** Spawn `researcher` subagent to gather best practices, documentation, and technical foundation before detailing requirements. Use `Task(subagent_type="researcher", prompt="Research [feature]", description="Research")`.
203
+ - Load `references/research-strategy.md` and `references/codebase-analysis.md`
204
+ - Classify evidence needs before writing requirements:
205
+ - **Targeted codebase scout is mandatory** when the spec changes existing behavior, touches an API/CLI/package export/schema/auth/session/permission/config/hook/runtime contract, lacks exact file paths, may invalidate tests, resumes an older spec, or crosses monorepo/package/runtime boundaries.
206
+ - **External/current research is mandatory** when the spec depends on third-party APIs, libraries, platform policies, AI providers/models/tooling, security/auth/payment/privacy/delete-data rules, performance/accessibility/SEO/security standards, or the user asks for "best", "optimal", "latest", "recommended", or equivalent.
207
+ - **Skip evidence gathering only** for trivial one-file edits, internal text/docs changes, isolated new files with no integration points, or decisions already backed by a recent user-provided report. Record the skip rationale in `research.md`.
208
+ - Codebase scout must be targeted, not a blind full-repo crawl. Identify relevant files/modules, current patterns, existing tests, contracts, and likely blast radius.
209
+ - External/current research must prefer official docs, standards, primary sources, or maintained upstream references. Record source links and the date/context of the finding.
210
+ - Write `research.md` before final requirements. It MUST include an Evidence Summary with: codebase scout result, external research result or skip rationale, selected decision, rejected alternatives, remaining gaps, and downstream task/test implications.
211
+ - If evidence exposes unresolved architecture choices, unclear acceptance criteria, or multiple viable approaches with no obvious winner, stop and route to `/hapo:brainstorm` instead of forcing a spec.
203
212
  - Write requirements in **EARS** format (see `rules/ears-format.md`)
204
213
  - **Feasibility Check:** Cross-check each requirement against known technical constraints from `research.md`.
205
214
  - Each requirement gets a unique numeric ID
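
The Step 5 classification above can be read as a small decision function. The sketch below is one possible encoding, assuming the caller has already reduced the spec to a set of trigger tags. The tag names, `EvidencePlan`, and `classify` are hypothetical; only the four quoted keywords come from the text. The case where no trigger and no skip condition applies is left to the caller.

```python
import re
from dataclasses import dataclass
from typing import Set

# Hypothetical tag names summarizing the triggers listed in Step 5.
SCOUT_TAGS = frozenset({
    "changes-existing-behavior", "touches-contract", "no-exact-paths",
    "may-invalidate-tests", "resumed-spec", "crosses-boundaries",
})
EXTERNAL_TAGS = frozenset({
    "third-party-dependency", "ai-tooling",
    "security-or-compliance", "quality-standards",
})
SKIP_TAGS = frozenset({
    "trivial-one-file-edit", "docs-only",
    "isolated-new-file", "recent-user-report",
})
KEYWORDS = frozenset({"best", "optimal", "latest", "recommended"})


@dataclass(frozen=True)
class EvidencePlan:
    codebase_scout: bool
    external_research: bool
    skip_rationale_required: bool


def classify(tags: Set[str], request_text: str = "") -> EvidencePlan:
    """Decide which evidence steps are mandatory before requirements."""
    words = set(re.findall(r"[a-z]+", request_text.lower()))
    scout = bool(tags & SCOUT_TAGS)
    external = bool(tags & EXTERNAL_TAGS) or bool(words & KEYWORDS)
    # Skipping is allowed only when no mandatory trigger fired.
    skip = bool(tags & SKIP_TAGS) and not (scout or external)
    return EvidencePlan(scout, external, skip)
```

For example, `classify({"docs-only"}, "what is the best wording?")` still requires external research, because the request asks for "best".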
@@ -217,6 +226,7 @@ Load: `references/scope-inquiry.md`
217
226
  - **full**: integration, security, schema, or performance
218
227
  - Record findings in `research.md` before finalizing design
219
228
  - Write `design.md` from template `templates/design.md` (see `rules/design-principles.md`)
229
+ - Design decisions MUST trace back to `research.md` evidence. If a design choice lacks evidence and is not a user-approved constraint, gather more evidence or ask the user before finalizing.
220
230
  - Add diagrams only when design has multi-step or cross-boundary flows
221
231
  - For auth/session, transport/entrypoint, persistence/schema, generated-artifact, or runtime-sensitive work, the design MUST fill the `Canonical Contracts & Invariants` section and tasks MUST inherit the same decisions verbatim.
222
232
  - Update `spec.json` phase, timestamps, discovery mode
@@ -227,6 +237,7 @@ Load: `references/scope-inquiry.md`
227
237
  - Load `rules/tasks-generation.md` for core principles
228
238
  - Load `rules/tasks-parallel-analysis.md` for parallel markers (default: enabled)
229
239
  - Each task file follows template `templates/task.md`
240
+ - `Related Files` and test plans must inherit paths, contracts, and test targets from the codebase scout. If exact files/tests cannot be named for an enhancement, run targeted inspect before generating tasks.
230
241
  - Each task file MUST include `Completion Criteria` and `Task Test Plan & Verification Evidence` sections detailed enough that a downstream quality gate can prove the task is truly done.
231
242
  - Build `spec.json.task_registry` alongside `task_files`. For each task file, register at minimum:
232
243
  - `id`
@@ -319,6 +330,7 @@ Load: `references/review.md` + `rules/design-review.md`
319
330
  - FAIL if any path in `task_files` does not exist on disk
320
331
  - FAIL if any task file exists on disk but is missing from `task_registry`
321
332
  - FAIL if any path in `task_registry` does not exist on disk
333
+ - FAIL if a newly generated non-trivial spec lacks a `research.md` Evidence Summary with codebase scout result, external research result or skip rationale, selected decision, rejected alternatives, and downstream task/test implications.
322
334
  - FAIL if any requirement or NFR mapping uses non-numeric labels (`NFR-1`, `SEC-1`, etc.)
323
335
  - FAIL if a task lacks `Completion Criteria` or `Task Test Plan & Verification Evidence` (legacy `Verification & Evidence` is accepted only for pre-existing task files)
324
336
  - FAIL if accepted validation decisions exist in reports but are not reflected in the implementation-facing sections of affected artifacts (`Objective`, `Constraints`, `Implementation Steps`, `Completion Criteria`, `Task Test Plan & Verification Evidence`, canonical contracts, or requirements text).
@@ -449,6 +461,7 @@ specs/
449
461
  ### Pre-Finalization Checklist
450
462
  Before finalizing any specification, assert all the following:
451
463
  - [ ] **scope_lock** initialized and respected throughout all phases
464
+ - [ ] **Evidence Summary** exists in `research.md` with codebase scout, external research or skip rationale, selected decision, rejected alternatives, and task/test implications
452
465
  - [ ] **EARS format** applied to all acceptance criteria in requirements.md
453
466
  - [ ] **Numeric requirement IDs** assigned to every requirement
454
467
  - [ ] **Discovery mode** selected and recorded in spec.json.design_context
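
The FAIL rules added to the finalization audit are mechanical enough to sketch as a checker. The version below assumes `task_files` is a list of relative paths and `task_registry` is a mapping keyed by relative path, as the hard rules describe. It also assumes the Evidence Summary uses the bold labels from `templates/research.md`. `audit_spec` is a hypothetical name, and deciding whether a spec counts as non-trivial is left to the caller.

```python
from typing import Dict, Iterable, List

# Labels the audit expects inside the Evidence Summary (from the template).
EVIDENCE_LABELS = (
    "Codebase Scout", "External / Current Research", "Selected Decision",
    "Rejected Alternatives", "Downstream Task & Test Implications",
)


def audit_spec(spec: Dict, files_on_disk: Iterable[str], research_md: str) -> List[str]:
    """Return FAIL messages; an empty list means these checks passed."""
    disk = set(files_on_disk)
    listed = set(spec.get("task_files", []))
    registry = set(spec.get("task_registry", {}))
    failures = []
    for path in sorted(listed - disk):
        failures.append(f"FAIL: task_files path missing on disk: {path}")
    for path in sorted(disk - registry):
        failures.append(f"FAIL: task file missing from task_registry: {path}")
    for path in sorted(registry - disk):
        failures.append(f"FAIL: task_registry path missing on disk: {path}")
    if "## Evidence Summary" not in research_md:
        failures.append("FAIL: research.md lacks an Evidence Summary")
    else:
        for label in EVIDENCE_LABELS:
            if f"**{label}**" not in research_md:
                failures.append(f"FAIL: Evidence Summary lacks: {label}")
    return failures
```

Returning a list of messages rather than raising on the first problem lets a reviewer see every failed gate in one pass.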
@@ -2,11 +2,26 @@
2
2
 
3
3
  ## Purpose
4
4
 
5
- Understand the current codebase before designing solutions — ensure the new spec aligns with existing architecture, patterns, and conventions.
5
+ Understand the current codebase before designing solutions — ensure the new spec aligns with existing architecture, patterns, contracts, tests, and runtime boundaries.
6
6
 
7
7
  ## Skip Conditions
8
8
 
9
9
  - Already provided with inspector reports → skip, use directly
10
+ - Greenfield artifact that does not integrate with existing code → record skip rationale in `research.md`
11
+ - Internal docs/text-only spec with no runtime behavior → record skip rationale in `research.md`
12
+
13
+ ## Targeted Codebase Scout Gate
14
+
15
+ Run a targeted scout before requirements when any of these are true:
16
+
17
+ - The feature modifies existing behavior, UI, API, CLI, data flow, runtime config, hooks, settings, generated artifacts, or package exports.
18
+ - The feature touches database schemas, migrations, auth/session, permissions, external integrations, or shared contracts.
19
+ - The task may break existing tests, snapshots, build scripts, type checks, e2e flows, or docs generation.
20
+ - The spec crosses monorepo boundaries such as source package → installed `.claude/`, library package → docs app, package manifest → publish/install runtime.
21
+ - `Related Files` cannot be named precisely yet.
22
+ - A resumed or validated spec may be stale because files, tests, dependencies, or contracts changed since the spec was created.
23
+
24
+ The scout must be narrow and question-driven. Do not scan the whole repo just because the repo is available.
10
25
 
11
26
  ## 4 Mandatory Files to Read First
12
27
 
@@ -22,7 +37,20 @@ Understand the current codebase before designing solutions — ensure the new sp
22
37
  2. **The "Blind Flight" Halt:** If ALL 4 mandatory docs are missing in a non-empty repository:
23
38
  - **DO NOT** blindly use `inspector` to scan the whole repo.
24
39
  - **HALT** the spec process immediately.
25
- - Ask the User: *"No codebase documentation found. Exploring blind will drain tokens and produce inaccurate specs. Shall I trigger `docs-keeper` or `/hapo:docs` to generate a baseline `codebase-summary.md` first?"*
40
+ - Ask the User: *"No codebase documentation found. Exploring blind will drain tokens and produce inaccurate specs. Shall I call the `docs-keeper` agent to generate a baseline `codebase-summary.md` first?"*
41
+
42
+ ## Scout Output Contract
43
+
44
+ Record the concise findings in `research.md`; if inspector agents are used, save detailed output to `reports/inspect-report.md`.
45
+
46
+ Required output:
47
+ - **Project surface:** project type, package/workspace boundaries, languages, frameworks, and relevant commands.
48
+ - **Relevant files/modules:** exact paths likely to be created, modified, deleted, or read.
49
+ - **Existing patterns:** naming, architecture, state/data flow, error handling, testing, and docs conventions that tasks must follow.
50
+ - **Contracts:** API/CLI/schema/auth/config/runtime/package/export contracts affected by the spec.
51
+ - **Tests and verification:** existing tests/checks likely to pass, fail, or require updates.
52
+ - **Blast radius:** affected modules, consumers, generated artifacts, publish/install paths, and rollback considerations.
53
+ - **Staleness check:** docs or prior specs that conflict with source code or manifests.
26
54
 
27
55
  ## Analysis Activities
28
56
 
@@ -38,7 +66,7 @@ Before designing any logic, you must identify and read the existing schemas:
38
66
  - Identify Global State setups (Redux stores, Zustand, React Context).
39
67
  - Output the relational impact: How will the new feature alter existing tables or state structures?
40
68
 
41
- ### 2. Pattern Recognition
69
+ ### 3. Pattern Recognition
42
70
  - Study existing patterns in codebase
43
71
  - Identify conventions and architectural decisions
44
72
  - Note consistency in implementation approaches
@@ -61,6 +89,7 @@ Write a "Collateral Damage" section in your `research.md`:
61
89
  - Each inspector targets a specific aspect of the task
62
90
  - Wait for all inspectors to report before analysis
63
91
  - Save results to `reports/inspect-report.md`
92
+ - If the scout cannot name exact paths/tests after inspection, stop and ask a grounded question instead of generating vague tasks
64
93
 
65
94
  ## Best Practices
66
95
 
@@ -69,3 +98,5 @@ Write a "Collateral Damage" section in your `research.md`:
69
98
  - Document patterns found for consistency
70
99
  - Note any inconsistencies or technical debt
71
100
  - Consider impact on existing features
101
+ - Use `rg`/targeted search terms or inspector agents before broad traversal
102
+ - Pass exact file and test findings downstream into `design.md` and task `Related Files`
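
The Scout Output Contract added above lists seven required areas. A small sketch of how a scout result could be checked against that list before it is written to `research.md` follows. The area names are taken from the contract; `missing_areas` and `to_markdown` are hypothetical helpers, and the bullet layout is only one plausible rendering.

```python
from typing import Dict, List

# The seven required areas from the Scout Output Contract.
REQUIRED_AREAS = (
    "Project surface", "Relevant files/modules", "Existing patterns",
    "Contracts", "Tests and verification", "Blast radius", "Staleness check",
)


def missing_areas(findings: Dict[str, str]) -> List[str]:
    """List required areas that are absent or blank, in contract order."""
    return [a for a in REQUIRED_AREAS if not findings.get(a, "").strip()]


def to_markdown(findings: Dict[str, str]) -> str:
    """Render complete findings as concise bullets; refuse incomplete output."""
    gaps = missing_areas(findings)
    if gaps:
        raise ValueError("scout output incomplete: " + ", ".join(gaps))
    return "\n".join(f"- **{a}:** {findings[a].strip()}" for a in REQUIRED_AREAS)
```

Raising on incomplete output mirrors the rule that the scout should stop and ask a grounded question rather than pass vague findings downstream.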
@@ -2,12 +2,34 @@
2
2
 
3
3
  ## Purpose
4
4
 
5
- Provide tools and methods to gather necessary information before writing requirements and design. Prioritize breadth before depth.
5
+ Provide tools and methods to gather necessary information before writing requirements and design. The goal is evidence-backed decision-making: use the current codebase and current external knowledge before locking requirements, architecture, and tasks.
6
6
 
7
7
  ## Skip Conditions
8
8
 
9
- Simple task, small scope → no research needed
10
- User already provided research reports → skip, use directly
9
+ - Simple one-file task with no integration point → record skip rationale in `research.md`
10
+ - Internal text/docs-only change → record skip rationale in `research.md`
11
+ - User already provided recent research reports → use directly, but record what was reused
12
+
13
+ Skipping research does NOT mean skipping the evidence trail. Every non-trivial spec still needs an Evidence Summary in `research.md`.
14
+
15
+ ## Evidence Gate Triggers
16
+
17
+ ### Targeted Codebase Scout — Mandatory When
18
+
19
+ - The spec changes existing behavior rather than creating an isolated new artifact.
20
+ - The spec touches API routes, CLI commands, package exports, database schemas, migrations, auth/session, permissions, runtime config, hooks, generated artifacts, or settings.
21
+ - Requirements or tasks cannot name exact affected files, modules, tests, or contracts.
22
+ - The change may invalidate existing `.test.*`, `.spec.*`, e2e, build, or integration checks.
23
+ - The spec is being resumed or validated after the codebase may have changed.
24
+ - The work crosses monorepo, package source, installed runtime, docs site, or publish/install boundaries.
25
+
26
+ ### External / Current Research — Mandatory When
27
+
28
+ - The spec depends on third-party APIs, libraries, SDKs, browser/platform policies, package manager behavior, cloud services, or external protocols.
29
+ - The spec touches security, auth, payment, privacy, delete-data, compliance, performance, accessibility, SEO, or current framework best practices.
30
+ - The spec involves AI providers, model behavior, agent tooling, browser automation, or fast-moving platform constraints.
31
+ - The user asks for "best", "optimal", "latest", "recommended", "current", "modern", or equivalent.
32
+ - Existing internal docs are stale, incomplete, or contradict package manifests/source code.
11
33
 
12
34
  ## 7 Research Tools
13
35
 
@@ -23,27 +45,50 @@ Provide tools and methods to gather necessary information before writing require
23
45
 
24
46
  ## Workflow
25
47
 
26
- ### 1. Identify What Needs Research
48
+ ### 1. Classify Evidence Needs
27
49
  Before detailing requirements, list unanswered questions:
28
50
  - Which technology is most suitable?
29
51
  - Is there an existing pattern/library that solves this?
30
52
  - How does the current codebase handle similar functionality?
31
53
  - Are there technical risks that need verification?
54
+ - What evidence is needed from the repository?
55
+ - What evidence requires current external sources?
56
+
57
+ ### 2. Run Targeted Codebase Scout
58
+ - Read project docs first, then verify claims against source files such as `package.json`, `go.mod`, schemas, routes, tests, and runtime config.
59
+ - Use inspector agents for large codebases or when multiple focused searches are needed.
60
+ - Save scout details to `reports/inspect-report.md` when inspector agents are used.
61
+ - Record the useful summary in `research.md`; do not dump raw search output.
32
62
 
33
- ### 2. Pick the Right Tool
63
+ ### 3. Run External / Current Research
64
+ - Prefer official documentation, standards, release notes, package repositories, or maintained upstream examples.
65
+ - Use broader web search only when primary sources do not answer the question.
66
+ - Record links and explain why each source matters.
67
+ - If sources conflict, state which source wins and why.
68
+
69
+ ### 4. Pick the Right Tool
34
70
  - Framework/API questions → Docs seeker
35
71
  - Current codebase questions → Inspector agents
36
72
  - Architecture/approach questions → Researcher agents
37
73
  - Complex multi-step reasoning → Sequential thinking
38
74
  - Historical decision questions → GitHub analysis
39
75
 
40
- ### 3. Spawn Researchers (when needed)
76
+ ### 5. Spawn Researchers (when needed)
41
77
  - Max 2 agents running in parallel
42
78
  - Each agent gets a specific aspect (e.g., agent 1 researches auth approach, agent 2 researches database schema)
43
79
  - Limit each agent to max 5 tool calls
44
80
  - Wait for all agents to complete before synthesizing
45
81
 
46
- ### 4. Record Findings
82
+ ### 6. Synthesize Decisions
83
+ - Convert raw findings into decisions before writing requirements:
84
+ - selected approach
85
+ - rejected alternatives
86
+ - codebase fit
87
+ - external/current constraints
88
+ - downstream task and test implications
89
+ - If evidence leaves multiple viable choices with no obvious winner, route to `/hapo:brainstorm` or ask the user for a decision.
90
+
91
+ ### 7. Record Findings
47
92
  - Write to `research.md` using template `templates/research.md`
48
93
  - Save researcher reports to `reports/researcher-{NN}.md`
49
94
  - Save inspector reports to `reports/inspect-report.md`
@@ -55,3 +100,5 @@ Before detailing requirements, list unanswered questions:
55
100
  - Identify multiple approaches for comparison
56
101
  - Consider edge cases during research
57
102
  - Flag security concerns early
103
+ - Do not design from memory when repository or current external evidence can settle the decision
104
+ - Keep external research concise: source, finding, implication, decision
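
The "Spawn Researchers" limits in this workflow (at most 2 agents in parallel, at most 5 tool calls each, wait for all before synthesizing) can be illustrated with a thread pool and a per-agent counter. This is a sketch only: `ToolBudget`, `run_researchers`, and the `research` callable are hypothetical stand-ins for whatever the runtime actually uses to invoke researcher agents.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

MAX_PARALLEL_AGENTS = 2  # "Max 2 agents running in parallel"
MAX_TOOL_CALLS = 5       # "Limit each agent to max 5 tool calls"


class ToolBudget:
    """Counts tool calls for one agent and refuses calls past the limit."""

    def __init__(self, limit: int = MAX_TOOL_CALLS) -> None:
        self.limit = limit
        self.used = 0

    def spend(self) -> None:
        if self.used >= self.limit:
            raise RuntimeError("tool-call budget exhausted")
        self.used += 1


def run_researchers(aspects: List[str],
                    research: Callable[[str, ToolBudget], str]) -> Dict[str, str]:
    """Run one researcher per aspect, two at a time, and wait for all of them."""
    with ThreadPoolExecutor(max_workers=MAX_PARALLEL_AGENTS) as pool:
        futures = {a: pool.submit(research, a, ToolBudget()) for a in aspects}
        # result() blocks, so synthesis starts only after every agent finishes.
        return {a: f.result() for a, f in futures.items()}
```

Each aspect gets its own budget, so one agent exhausting its five calls does not affect the other.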
@@ -17,6 +17,52 @@
17
17
  - Finding 2
18
18
  - Finding 3
19
19
 
20
+ ## Evidence Summary
21
+ This section is mandatory for non-trivial specs. It must be written before finalizing requirements, design, or tasks.
22
+
23
+ - **Codebase Scout**: Required / Skipped
24
+ - Result or skip rationale:
25
+ - Relevant files/modules:
26
+ - Existing patterns/contracts:
27
+ - Tests or checks affected:
28
+ - **External / Current Research**: Required / Skipped
29
+ - Result or skip rationale:
30
+ - Primary sources:
31
+ - Current constraints or best practices:
32
+ - **Selected Decision**:
33
+ - Decision:
34
+ - Why it fits the current codebase:
35
+ - Why it fits current external constraints:
36
+ - **Rejected Alternatives**:
37
+ - Alternative 1 — rejection reason
38
+ - Alternative 2 — rejection reason
39
+ - **Remaining Gaps / Questions**:
40
+ - Gap 1
41
+ - Gap 2
42
+ - **Downstream Task & Test Implications**:
43
+ - Task implication:
44
+ - Test/verification implication:
45
+
46
+ ## Codebase Scout
47
+ Capture only useful repo evidence, not raw file dumps.
48
+
49
+ | Area | Finding | Evidence / Path | Implication |
50
+ |------|---------|-----------------|-------------|
51
+ | Project surface | | | |
52
+ | Relevant files/modules | | | |
53
+ | Existing patterns | | | |
54
+ | Contracts | | | |
55
+ | Tests and verification | | | |
56
+ | Blast radius | | | |
57
+ | Staleness / conflicts | | | |
58
+
59
+ ## External / Current Research
60
+ Use official docs, standards, package repos, release notes, or maintained upstream references first.
61
+
62
+ | Question | Source | Finding | Decision Impact |
63
+ |----------|--------|---------|-----------------|
64
+ | | | | |
65
+
20
66
  ## Research Log
21
67
  Document notable investigation steps and their outcomes. Group entries by topic for readability.
22
68
 
@@ -75,7 +75,7 @@ See `references/execution-strategy.md` Phase C for full phase breakdown.
75
75
 
76
76
  Delegate execution to `test-runner` agent:
77
77
  ```
78
- Task(subagent_type="test-runner",
78
+ Agent(subagent_type="test-runner",
79
79
  prompt="Run tests. Scope: [blast-radius|full|ui]. Target: [path|url]. Return structured verdict.",
80
80
  description="Test [feature]")
81
81
  ```