qualia-framework 3.4.0 → 4.0.0

Files changed (42)
  1. package/README.md +96 -51
  2. package/agents/builder.md +25 -14
  3. package/agents/plan-checker.md +29 -16
  4. package/agents/planner.md +33 -24
  5. package/agents/research-synthesizer.md +25 -12
  6. package/agents/roadmapper.md +89 -84
  7. package/agents/verifier.md +11 -2
  8. package/bin/cli.js +13 -2
  9. package/bin/install.js +28 -5
  10. package/bin/qualia-ui.js +267 -1
  11. package/bin/state.js +377 -52
  12. package/bin/statusline.js +40 -20
  13. package/docs/erp-contract.md +23 -2
  14. package/guide.md +84 -21
  15. package/hooks/auto-update.js +54 -70
  16. package/hooks/branch-guard.js +64 -6
  17. package/hooks/migration-guard.js +85 -10
  18. package/hooks/pre-compact.js +28 -4
  19. package/hooks/pre-deploy-gate.js +46 -6
  20. package/hooks/pre-push.js +94 -27
  21. package/hooks/session-start.js +6 -0
  22. package/package.json +1 -1
  23. package/skills/qualia/SKILL.md +3 -1
  24. package/skills/qualia-build/SKILL.md +40 -5
  25. package/skills/qualia-handoff/SKILL.md +87 -12
  26. package/skills/qualia-idk/SKILL.md +155 -3
  27. package/skills/qualia-map/SKILL.md +4 -4
  28. package/skills/qualia-milestone/SKILL.md +122 -79
  29. package/skills/qualia-new/SKILL.md +151 -230
  30. package/skills/qualia-optimize/SKILL.md +4 -4
  31. package/skills/qualia-plan/SKILL.md +14 -9
  32. package/skills/qualia-quick/SKILL.md +1 -1
  33. package/skills/qualia-report/SKILL.md +12 -0
  34. package/skills/qualia-verify/SKILL.md +59 -5
  35. package/templates/help.html +98 -31
  36. package/templates/journey.md +113 -0
  37. package/templates/plan.md +56 -11
  38. package/templates/requirements.md +82 -22
  39. package/templates/roadmap.md +41 -14
  40. package/templates/tracking.json +12 -1
  41. package/tests/runner.js +560 -0
  42. package/tests/state.test.sh +40 -0
package/README.md CHANGED
@@ -1,10 +1,10 @@
1
- # Qualia Framework v3
1
+ # Qualia Framework v4
2
2
 
3
3
  A harness engineering framework for [Claude Code](https://claude.ai/code). It installs into `~/.claude/` and wraps your AI-assisted development workflow with structured planning, execution, verification, and deployment gates.
4
4
 
5
- It is not an application framework like Rails or Next.js. It doesn't generate code, run servers, or process data. It's an opinionated workflow layer that tells Claude how to plan, build, and verify your projects.
5
+ It is not an application framework like Rails or Next.js. It doesn't generate code, run servers, or process data. It's an opinionated workflow layer that tells Claude how to plan, build, and verify your projects — end-to-end, from "tell me what you want to make" to "here's the handoff doc for your client."
6
6
 
7
- v3 applies lessons from Anthropic's ["Harness Design for Long-Running Apps"](https://www.anthropic.com/engineering/harness-design-long-running-apps) article: scored evaluator rubrics, verification contracts, smarter guards, hook telemetry, and dynamic team management.
7
+ **v4 is the Full Journey release.** `/qualia-new` now maps the entire project arc from kickoff to client handoff upfront (all milestones, not just v1), and the Road can chain itself end-to-end in `--auto` mode with only two human gates per project. Story-file plan format, goal-backward verification, and the 4-dimension scoring rubric from v3 all carry forward.
8
8
 
9
9
  ## Install
10
10
 
@@ -26,68 +26,100 @@ npx qualia-framework traces # View recent hook telemetry
26
26
 
27
27
  ## Usage
28
28
 
29
- Open Claude Code in any project directory:
29
+ Open Claude Code in any project directory.
30
30
 
31
- ### The Road (main flow)
31
+ ### The Road — guided mode (default)
32
32
 
33
33
  ```
34
- /qualia-new # Set up a new project (deep questioning + research + roadmap)
35
- /qualia-plan N # Plan phase N (with plan-checker validation loop)
36
- /qualia-build N # Build phase N (wave-based parallel tasks)
37
- /qualia-verify N # Verify phase N works (goal-backward + QA browser)
34
+ /qualia-new # Set up a project: questioning + research + JOURNEY.md with all milestones → Handoff
35
+ /qualia-plan N # Plan phase N of the current milestone (story-file format, plan-checker validation loop)
36
+ /qualia-build N # Build phase N (builder subagents with pre-inlined context, wave-based parallel tasks)
37
+ /qualia-verify N # Verify phase N works (goal-backward + per-task acceptance criteria + browser QA)
38
38
  ...repeat plan/build/verify per phase...
39
- /qualia-polish # Design and UX pass
40
- /qualia-ship # Deploy to production
41
- /qualia-handoff # Deliver to client
39
+ /qualia-milestone # Close current milestone, open next (loads next scope from JOURNEY.md)
40
+ ...repeat per milestone until the final "Handoff" milestone...
41
+ /qualia-polish # Design and UX pass (first phase of the Handoff milestone)
42
+ /qualia-ship # Deploy to production
43
+ /qualia-handoff # Enforce the 4 mandatory handoff deliverables
44
+ /qualia-report # Mandatory end-of-session report + ERP upload
42
45
  ```
43
46
 
47
+ ### The Road — auto mode
48
+
49
+ ```
50
+ /qualia-new --auto
51
+ ```
52
+
53
+ Research runs automatically. User approves the full journey once. Framework chains plan → build → verify → (next phase) → ... → milestone boundary. User approves continuation per milestone. Framework resumes, eventually reaches the Handoff milestone's last phase → ship → handoff → report. Done.
54
+
55
+ Two human gates per project. One halt case (gap-cycle limit exceeded on a failing phase).
56
+
44
57
  ### Phase-specific depth (optional)
45
58
 
46
59
  ```
47
- /qualia-discuss N # Capture decisions before planning a complex phase
60
+ /qualia-discuss N # Capture decisions before planning a complex phase (locks constraints for the planner)
48
61
  /qualia-research N # Deep-research a niche phase (Context7/WebFetch/WebSearch)
49
- /qualia-map # Map existing codebase (brownfield projects)
50
- /qualia-milestone # Close current milestone, open next
62
+ /qualia-map # Map existing codebase (brownfield projects — run before /qualia-new)
51
63
  ```
52
64
 
53
65
  ### Navigation & state
54
66
 
55
67
  ```
56
- /qualia # What should I do next? (smart router)
57
- /qualia-idk # I'm stuck — smart advisor
68
+ /qualia # Mechanical state router — "what's my next command?"
69
+ /qualia-idk # Diagnostic — "what's actually going on?" Two isolated scans (planning / codebase), then a plain-language explanation
58
70
  /qualia-pause # Save session, continue later
59
71
  /qualia-resume # Pick up where you left off
60
72
  ```
61
73
 
62
- ### Quality & debug
74
+ ### Quality & shortcuts
63
75
 
64
76
  ```
65
77
  /qualia-debug # Structured debugging
66
78
  /qualia-design # One-shot design transformation
67
- /qualia-review # Production audit
68
- /qualia-optimize # Deep optimization pass
69
- /qualia-quick # Skip planning, just do it
70
- /qualia-task # Build one thing properly
79
+ /qualia-review # Production audit (scored diagnostics)
80
+ /qualia-optimize # Deep optimization pass (parallel specialist agents)
81
+ /qualia-quick # Fast path for trivial fixes (skips planning)
82
+ /qualia-task # Build one thing properly (fresh builder, atomic commit, no phase plan)
71
83
  /qualia-test # Generate or run tests
72
84
  ```
73
85
 
74
- ### Knowledge & reporting
86
+ ### Knowledge & meta
75
87
 
76
88
  ```
77
- /qualia-learn # Save a pattern, fix, or client pref
78
- /qualia-report # Log your work (mandatory end of day)
89
+ /qualia-learn # Save a pattern, fix, or client pref to ~/.claude/knowledge/
90
+ /qualia-skill-new # Author a new Qualia skill or agent
79
91
  /qualia-help # Open the framework reference in your browser
80
92
  ```
81
93
 
82
94
  See `guide.md` for the full developer guide.
83
95
 
84
- ## What's Inside (v3.3.0)
96
+ ## The Full Journey (v4)
97
+
98
+ Every v4 project has a `.planning/JOURNEY.md` — the North Star document that maps the entire arc from kickoff to client handoff.
99
+
100
+ ```
101
+ Project
102
+ └─ Journey (all milestones defined upfront)
103
+ └─ Milestone (a release — 2-5 total, Handoff is always last)
104
+ └─ Phase (a feature-sized deliverable, 2-5 tasks)
105
+ └─ Task (atomic unit, one commit, one verification contract)
106
+ ```
85
107
 
86
- - **26 skills** — slash commands from setup to handoff, plus debugging, design, review, knowledge, session management, skill authoring, and the new deep-flow additions (discuss, research, map, milestone)
87
- - **8 agents** — planner, builder, verifier, qa-browser, researcher, research-synthesizer, roadmapper, plan-checker (each in fresh context)
88
- - **7 hooks** — session start, branch guard, pre-push tracking sync, migration guard, deploy gate, pre-compact state save, auto-update (all Node.js — cross-platform)
89
- - **5 rules** — security, frontend, design-reference, deployment, infrastructure
90
- - **12+ templates** — project.md, plan.md, state.md, DESIGN.md, tracking.json, requirements.md, roadmap.md, phase-context.md, 4× research-project templates, 4× project-type templates
108
+ **Hard rules:**
109
+ - Hard floor: 2 milestones. Hard ceiling: 5.
110
+ - Final milestone is **always literally named "Handoff"** with 4 fixed phases (Polish, Content + SEO, Final QA, Handoff).
111
+ - Every non-Handoff milestone needs **≥ 2 phases** (enforced by `state.js close-milestone`).
112
+ - Milestone numbering is contiguous.
113
+
114
+ **Why it matters:** non-technical team members can follow the ladder from any entry point. `/qualia` and `/qualia-milestone` render JOURNEY.md as a visual ladder with current position highlighted.
115
+
116
+ ## What's Inside (v4.0.0)
117
+
118
+ - **26 skills** — from setup to handoff, plus debug, design, review, optimize, diagnostic (`qualia-idk`), session management, skill authoring, per-phase depth (discuss, research, map), and full-journey additions (`--auto` chaining, milestone closure)
119
+ - **8 agents** (each runs in fresh context): planner, builder, verifier, qa-browser, researcher, research-synthesizer, roadmapper, plan-checker
120
+ - **7 hooks** (pure Node.js, cross-platform): session-start, branch-guard, pre-push tracking sync, migration-guard, pre-deploy-gate, pre-compact state save, auto-update
121
+ - **5 rules**: security, frontend, design-reference, deployment, infrastructure
122
+ - **19 template files**: project.md, **journey.md** (new in v4), plan.md (story-file format), state.md, DESIGN.md, tracking.json (now with `milestone_name` + `milestones[]`), requirements.md (multi-milestone), roadmap.md (current milestone only), phase-context.md, 4 project-type templates (website, ai-agent, voice-agent, mobile-app), 5 research-project templates (STACK, FEATURES, ARCHITECTURE, PITFALLS, SUMMARY), help.html
91
123
  - **1 reference** — questioning.md methodology for deep project initialization
92
124
 
93
125
  ## Supported Platforms
@@ -100,35 +132,47 @@ Works on **Windows 10/11, macOS, and Linux**. Requires Node.js 18+ and Claude Co
100
132
 
101
133
  ## Why It Works
102
134
 
135
+ ### Full Journey (v4)
136
+
137
+ `/qualia-new` maps every milestone from kickoff to handoff. Team members see the entire ladder before climbing. No improvising the next chunk after each ship. The final milestone is always "Handoff" with 4 mandatory deliverables (verified production URL, updated docs, archived client assets, final ERP report) — so the path to "shipped" is visible from day 1.
138
+
139
+ ### Auto-Chain End-to-End
140
+
141
+ `--auto` mode chains `/qualia-plan → /qualia-build → /qualia-verify → …` without re-typing commands. The framework pauses only at real decisions: journey approval at kickoff, each milestone boundary, and one halt on gap-cycle-limit failures. Everything in between runs on rails.
142
+
103
143
  ### Goal-Backward Verification
104
144
 
105
- Most CI checks "did the task run." Qualia checks "does the outcome actually work." The verifier scores on 4 dimensions (Correctness, Completeness, Wiring, Quality), each 1-5, with a hard threshold at 3. It doesn't trust summaries — it greps the codebase for stubs, placeholders, unwired imports. The planner generates verification contracts (testable commands) that the verifier executes before ad-hoc checks.
145
+ Most CI checks "did the task run." Qualia checks "does the outcome actually work." The verifier scores on 4 dimensions (Correctness, Completeness, Wiring, Quality), each 1-5, with a hard threshold at 3. It doesn't trust summaries — it greps the codebase for stubs, placeholders, unwired imports, and walks each task's observable Acceptance Criteria.
146
+
147
+ ### Story-File Plans (Plans Are Prompts)
148
+
149
+ Plan files aren't documents that get translated into prompts — they ARE the prompts. Every task carries inline `Why` (rationale), `Acceptance Criteria` (observable user behaviors), `Depends on` (explicit ordering), and `Validation` (self-check commands) before the builder touches code. `@file` references tell the orchestrator what to pre-inline into the builder's prompt, saving 3-5 orientation Read calls per task.
106
150
 
107
151
  ### Agent Separation
108
152
 
109
- Splitting planner, builder, and verifier into separate agents with separate contexts prevents the "God prompt" problem where one massive context tries to plan AND code AND test. Each agent gets fresh context. This directly addresses Claude's quality degradation curve — task 50 gets the same quality as task 1.
153
+ Splitting planner, builder, and verifier into separate agents with separate contexts prevents the "God prompt" problem. Each agent gets fresh context. Task 50 gets the same quality as task 1.
110
154
 
111
155
  ### Production-Grade Hooks
112
156
 
113
- All 8 hooks are real ops engineering, not theoretical. Highlights:
157
+ All 7 hooks are real ops engineering, not theoretical:
114
158
 
115
159
  - **Pre-deploy gate** — TypeScript, lint, tests, build, and `service_role` leak scan before `vercel --prod`
116
- - **Branch guard** — Role-aware: owner can push to main, employees can't
117
- - **Migration guard** — Catches `DROP TABLE` without `IF EXISTS`, `DELETE` without `WHERE`, `CREATE TABLE` without RLS
118
- - **Env block** — Prevents Claude from touching `.env` files
160
+ - **Branch guard** — Role-aware: owner can push to main, employees can't (parses refspec so `feature/x:main` bypass is blocked)
161
+ - **Migration guard** — Catches `DROP TABLE` without `IF EXISTS`, `DELETE`/`UPDATE` without `WHERE`, `CREATE TABLE` without RLS, `GRANT ... TO PUBLIC`, `ALTER TABLE ... DROP COLUMN`
162
+ - **Pre-push** — Stamps tracking.json via a bot commit so the ERP always sees fresh data
119
163
  - **Pre-compact** — Saves state before context compression
120
164
 
121
165
  ### Enforced State Machine
122
166
 
123
- Every workflow step calls `state.js` — a Node.js state machine that validates preconditions (including plan content), updates both STATE.md and tracking.json atomically, and tracks gap-closure cycles. The gap-closure limit is configurable per project (default: 2). A `--force` flag enables recovery after failed builds.
167
+ Every workflow step calls `state.js` — a Node.js state machine that validates preconditions (including plan content), updates both STATE.md and tracking.json atomically, and tracks gap-closure cycles. v4 adds milestone readiness guards: `close-milestone` refuses to close a milestone with unverified phases or < 2 phases (unless `--force`), and appends a summary to `tracking.json.milestones[]` so the ERP renders a clean project tree.
124
168
 
125
169
  ### Wave-Based Parallelization
126
170
 
127
171
  Plans are grouped into waves for parallel execution. No fancy DAG solver — the planner assigns wave numbers, the orchestrator spawns agents per wave. Pragmatic over clever.
128
172
 
129
- ### Plans Are Prompts
173
+ ### Diagnostic Intelligence
130
174
 
131
- Plan files aren't documents that get translated into prompts — they ARE the prompts. `@file` references, explicit task actions, and verification criteria baked in. This eliminates translation loss between "what we planned" and "what Claude actually reads."
175
+ `/qualia-idk` is a real diagnostician (not a router alias). When the user's confusion is about *understanding the situation*, it spawns two isolated scans in parallel — one reads only `.planning/`, the other reads only source code — then synthesizes a plain-language "What I see / What I think is happening / What to do next" diagnosis. Catches plan↔code drift that a state-only router can't see.
132
176
 
133
177
  ## Architecture
134
178
 
@@ -137,23 +181,24 @@ npx qualia-framework install
137
181
  |
138
182
  v
139
183
  ~/.claude/
140
- ├── skills/ 19 slash commands
141
- ├── agents/ planner.md, builder.md, verifier.md, qa-browser.md
142
- ├── hooks/ 8 Node.js hooks — cross-platform (no bash dependency)
143
- ├── bin/ state.js (state machine) + qualia-ui.js (cosmetics library)
144
- ├── knowledge/ learned-patterns.md, common-fixes.md, client-prefs.md (loaded by plan/debug/new)
145
- ├── rules/ security.md, frontend.md, design-reference.md, deployment.md
146
- ├── qualia-templates/ tracking.json, state.md, project.md, plan.md, DESIGN.md
147
- ├── CLAUDE.md global instructions (role-configured per team member)
148
- └── statusline.js teal-branded 2-line status bar
184
+ ├── skills/ 26 slash commands
185
+ ├── agents/ 8 agent definitions (planner, builder, verifier, qa-browser, roadmapper, research-synthesizer, researcher, plan-checker)
186
+ ├── hooks/ 7 Node.js hooks — cross-platform (no bash dependency)
187
+ ├── bin/ state.js (state machine) + qualia-ui.js (cosmetics, banners, journey-tree) + statusline.js
188
+ ├── knowledge/ learned-patterns.md, common-fixes.md, client-prefs.md
189
+ ├── rules/ security, frontend, design-reference, deployment, infrastructure
190
+ ├── qualia-templates/ project.md, journey.md, plan.md (story-file), state.md, DESIGN.md, tracking.json, requirements.md, roadmap.md, + projects/*.md + research-project/*.md + help.html
191
+ ├── qualia-references/ questioning.md (deep project initialization methodology)
192
+ ├── CLAUDE.md global instructions (role-configured per team member)
193
+ └── (settings.json wired for hooks, statusline, spinner verbs, etc.)
149
194
  ```
150
195
 
151
196
  ## For Qualia Solutions Team
152
197
 
153
- Stack: Next.js 16+, React 19, TypeScript, Supabase, Vercel.
198
+ Stack: Next.js 16+, React 19, TypeScript, Supabase, Vercel. Voice: Retell AI, ElevenLabs, Telnyx. AI: OpenRouter. Compute: Railway (agents/background jobs).
154
199
 
155
200
  ## Changelog
156
201
 
157
- See [CHANGELOG.md](./CHANGELOG.md) for the full version history.
202
+ See [CHANGELOG.md](./CHANGELOG.md) for the full version history. v4.0.0 release notes are the most recent section.
158
203
 
159
204
  Built by [Qualia Solutions](https://qualiasolutions.net) — Nicosia, Cyprus.
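
Reviewer's note (not part of the package): the JOURNEY.md hard rules that the new README lists could be checked along the lines of the sketch below. The function name `validateJourney`, the milestone shape `{ number, name, phases }`, and the assumption that numbering starts at 1 are all hypothetical. According to the README, the real enforcement lives in `state.js close-milestone`, and its implementation may differ.

```javascript
// Hypothetical sketch: validates the JOURNEY.md hard rules listed in the v4 README.
// The milestone shape { number, name, phases: string[] } is assumed for illustration.
const HANDOFF_PHASES = ["Polish", "Content + SEO", "Final QA", "Handoff"];

function validateJourney(milestones) {
  const errors = [];

  // Hard floor: 2 milestones. Hard ceiling: 5.
  if (milestones.length < 2 || milestones.length > 5) {
    errors.push(`expected 2-5 milestones, got ${milestones.length}`);
  }

  // The final milestone is always named "Handoff" with 4 fixed phases.
  const last = milestones[milestones.length - 1];
  if (!last || last.name !== "Handoff") {
    errors.push('final milestone must be named "Handoff"');
  } else if (last.phases.join("|") !== HANDOFF_PHASES.join("|")) {
    errors.push("Handoff milestone must have the 4 fixed phases");
  }

  milestones.forEach((m, i) => {
    // Milestone numbering is contiguous (starting at 1 is an assumption).
    if (m.number !== i + 1) {
      errors.push(`milestone at position ${i + 1} is numbered ${m.number}`);
    }
    // Every non-Handoff milestone needs at least 2 phases.
    if (m.name !== "Handoff" && m.phases.length < 2) {
      errors.push(`milestone "${m.name}" has fewer than 2 phases`);
    }
  });

  return errors;
}
```

An empty array means the journey satisfies all four rules. Returning a list of messages rather than throwing keeps every violation visible in one pass.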
package/agents/builder.md CHANGED
@@ -16,29 +16,40 @@ Working code + atomic git commit.
16
16
 
17
17
  ## How to Execute
18
18
 
19
- ### 1. Read Your Task
20
- Parse your task block:
21
- - **Files:** what to create or modify
22
- - **Action:** what to build
23
- - **Context:** read the `@file` references NOW before writing anything
24
- - **Done when:** the criterion you'll verify against
19
+ ### 1. Read Your Task (Story File)
20
+
21
+ Parse every field in your task block:
22
+ - **Wave / Depends on:** you should only be running when your dependencies are committed. If `Depends on: Task 1` and Task 1 isn't in git log, STOP and return `BLOCKED — waiting on Task N`.
23
+ - **Persona (optional):** if set to `security`, weight security rules heavily. If `ux`, prioritize accessibility + states. If `frontend`, read `.planning/DESIGN.md`. Acts as a lens, not a separate brain.
24
+ - **Files:** what to create or modify (scope boundary)
25
+ - **Why:** internalize this. It's the rationale. If you can't explain why this task matters in one sentence after reading, re-read before coding.
26
+ - **Acceptance Criteria:** the user-facing behaviors you must produce. You are done when these are true.
27
+ - **Action:** the concrete steps. Follow them.
28
+ - **Validation:** your self-check commands. Run these BEFORE `git commit`.
29
+ - **Context:** read every `@file` reference NOW before writing anything.
25
30
 
26
31
  ### 2. Read Before Write
27
32
  For every file you're about to modify — read it first. No exceptions.
28
- For every `@file` reference in your context — read it now.
33
+ For every `@file` reference in Context — read it now.
29
34
 
30
35
  ### 3. Build It
31
- - Follow the action exactly as specified
36
+ - Follow the Action exactly as specified
37
+ - Keep every Acceptance Criterion in mind — you are building toward observable user behaviors, not just files
32
38
  - MVP only — build what's asked, nothing extra
33
39
  - If the plan says "use library X" — use library X
34
40
  - If something in the plan seems wrong, flag it but still follow the plan
35
41
 
36
- ### 4. Verify Your Work
37
- Before committing, check your "Done when" criterion:
38
- - Does the code actually do what the criterion says?
39
- - Run `npx tsc --noEmit` if you touched TypeScript files
40
- - No `// TODO`, no placeholder text, no stub functions
41
- - Imports are wired — not just declared but actually used
42
+ ### 4. Self-Verify Your Work
43
+
44
+ Before committing:
45
+
46
+ 1. Run every command in **Validation:** — they must pass
47
+ 2. Mentally walk through each **Acceptance Criterion** — does the code actually produce that observable behavior?
48
+ 3. Run `npx tsc --noEmit` if you touched TypeScript files
49
+ 4. No `// TODO`, no placeholder text, no stub functions
50
+ 5. Imports are wired — not just declared but actually used
51
+
52
+ If any Validation command fails or any AC is not met, fix before committing. Do not commit and hope the verifier catches it.
42
53
 
43
54
  ### 5. Commit
44
55
  One atomic commit per task:
package/agents/plan-checker.md CHANGED
@@ -34,29 +34,42 @@ Plan must have YAML frontmatter with:
34
34
 
35
35
  **FAIL if:** frontmatter missing, incomplete, or `goal` differs from ROADMAP.md.
36
36
 
37
- ### Rule 2: Every task has the 3 mandatory fields
37
+ ### Rule 2: Every task has the 6 mandatory story-file fields
38
38
 
39
- Each `## Task N — title` block must include:
40
- - **Files:** specific absolute paths (not "the auth files", not "relevant components")
41
- - **Action:** concrete instructions (not "implement auth", not "add the feature")
42
- - **Done when:** testable criterion (not "auth works", not "it's done")
39
+ Each `## Task N — title` block must include ALL of these:
43
40
 
44
- **FAIL if:** any task missing any of the 3 fields, OR any field is vague.
41
+ - **Wave:** integer (e.g. `**Wave:** 1`)
42
+ - **Files:** specific absolute paths (not "the auth files", not "relevant components")
43
+ - **Depends on:** explicit task numbers OR `none` (not blank)
44
+ - **Why:** one-sentence rationale — what problem this solves (not "implement X")
45
+ - **Acceptance Criteria:** 2-4 observable user-facing behaviors as bullet points
46
+ - **Action:** concrete instructions with specific functions/imports/patterns
47
+ - **Validation:** 1-3 grep/curl/tsc commands the builder runs before committing
45
48
 
46
- **How to detect vague:**
47
- - `Files: {filenames}` → pass
48
- - `Files: relevant files` → fail
49
- - `Action: Build the login page using Supabase auth with email/password, validate with Zod, redirect to /dashboard` → pass
50
- - `Action: Implement authentication` → fail
51
- - `Done when: grep -c "signInWithPassword" src/lib/auth.ts returns non-zero` → pass
52
- - `Done when: auth works` → fail
49
+ `**Persona:**` is optional — warn if present but not one of {security, architect, ux, frontend, backend, performance, none}.
53
50
 
54
- ### Rule 3: Wave assignments are correct
51
+ **FAIL if:** any task missing any of the 7 required fields, OR any field is vague.
55
52
 
56
- Each task has a `**Wave:** {N}` field. Waves group tasks for parallel execution.
53
+ **How to detect vague:**
54
+ - `Files: relevant files` → FAIL
55
+ - `Files: src/lib/auth.ts, src/app/login/page.tsx` → PASS
56
+ - `Why: implement authentication` → FAIL (that's a what, not a why)
57
+ - `Why: Session persistence is the #1 abandonment trigger in the onboarding funnel` → PASS
58
+ - `Acceptance Criteria: - auth works` → FAIL (not observable)
59
+ - `Acceptance Criteria: - User signs up with email, sees verification prompt, clicks link, lands on /dashboard with session` → PASS
60
+ - `Action: Implement auth` → FAIL
61
+ - `Action: Add signInWithPassword() call in handleSubmit, validate with Zod, redirect to /dashboard on success` → PASS
62
+ - `Validation: it should work` → FAIL
63
+ - `Validation: grep -c "signInWithPassword" src/lib/auth.ts → ≥ 1` → PASS
64
+ - `Depends on:` (blank) → FAIL — must be explicit `none` or `Task N`
65
+
66
+ ### Rule 3: Wave assignments are correct and consistent with Depends on
67
+
68
+ Each task has a `**Wave:** {N}` field. Waves group tasks for parallel execution. The wave number must be consistent with the task's `**Depends on:**` line.
57
69
 
58
70
  **FAIL if:**
59
- - Task in Wave 2 doesn't reference a Wave 1 task as a dependency
71
+ - Task in Wave 2+ has `Depends on: none` (contradicts wave ordering — should be Wave 1)
72
+ - Task in Wave N has a dependency on a task in Wave ≥N (impossible — dep must be in an earlier wave)
60
73
  - Tasks in same wave touch the same files (file conflict — can't run in parallel)
61
74
  - More than 3 waves (tasks too granular)
62
75
 
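
Reviewer's note (not part of the package): the Rule 3 failure conditions above can be read as a small consistency check over the parsed tasks. The sketch below is an assumption about how such a check might look. The `checkWaves` name and the task shape `{ id, wave, dependsOn, files }` are hypothetical, and the plan-checker itself is a prompt-driven agent rather than this code.

```javascript
// Hypothetical sketch of plan-checker Rule 3.
// Assumed task shape: { id: number, wave: number, dependsOn: number[], files: string[] }
function checkWaves(tasks) {
  const failures = [];
  const waveOf = new Map(tasks.map((t) => [t.id, t.wave]));

  for (const t of tasks) {
    // A task in Wave 2+ with no dependencies belongs in Wave 1.
    if (t.wave >= 2 && t.dependsOn.length === 0) {
      failures.push(`Task ${t.id}: Wave ${t.wave} but Depends on: none`);
    }
    // A dependency must sit in an earlier wave (unknown ids are ignored here).
    for (const dep of t.dependsOn) {
      const depWave = waveOf.get(dep);
      if (depWave !== undefined && depWave >= t.wave) {
        failures.push(`Task ${t.id}: depends on Task ${dep} in Wave ${depWave}`);
      }
    }
  }

  // Tasks in the same wave must not touch the same file.
  const owner = new Map(); // "wave:file" -> id of the first task that claimed it
  for (const t of tasks) {
    for (const file of t.files) {
      const key = `${t.wave}:${file}`;
      if (owner.has(key)) {
        failures.push(`Tasks ${owner.get(key)} and ${t.id} both touch ${file} in Wave ${t.wave}`);
      } else {
        owner.set(key, t.id);
      }
    }
  }

  // More than 3 distinct waves means the tasks are too granular.
  if (new Set(tasks.map((t) => t.wave)).size > 3) {
    failures.push("more than 3 waves");
  }

  return failures;
}
```

An empty result corresponds to a pass on Rule 3. Any entry maps to one of the four FAIL conditions.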
package/agents/planner.md CHANGED
@@ -34,7 +34,11 @@ Each truth → one task. 2-5 tasks per phase. Each task must fit in one context
34
34
  - **Wave 2:** Tasks that depend on Wave 1 (run after Wave 1 completes)
35
35
  - Most phases need 1-2 waves. If you need 3+, your tasks are too granular.
36
36
 
37
- ### 4. Write the Plan
37
+ ### 4. Write the Plan (Story-File Format)
38
+
39
+ Plans are STORY FILES, not task lists. Every task is a self-contained package that embeds *why*, *what*, and *how to verify* — so the builder can execute without re-reading PRDs and the verifier has explicit acceptance targets.
40
+
41
+ Use `~/.claude/qualia-templates/plan.md` as the structural reference. Every task block MUST include: **Wave, Files, Depends on, Why, Acceptance Criteria, Action, Validation, Context.** Persona is optional.
38
42
 
39
43
  ```markdown
40
44
  ---
@@ -46,40 +50,45 @@ waves: {count}
46
50
 
47
51
  # Phase {N}: {Name}
48
52
 
49
- Goal: {what must be true when done}
53
+ **Goal:** {what must be TRUE when this phase is done}
54
+ **Why this phase:** {one sentence — what this unlocks}
50
55
 
51
56
  ## Task 1 — {title}
52
57
  **Wave:** 1
53
- **Files:** {files to create or modify}
54
- **Action:** {exactly what to build — specific enough for a junior dev to follow}
55
- **Context:** Read @{file references the builder needs}
56
- **Done when:** {observable, testable criterion}
58
+ **Persona:** {optional: security | architect | ux | frontend | backend | performance | none}
59
+ **Files:** {specific paths}
60
+ **Depends on:** {none | Task N}
57
61
 
58
- ## Task 2 — {title}
59
- **Wave:** 1
60
- **Files:** {files}
61
- **Action:** {what to build}
62
- **Done when:** {criterion}
62
+ **Why:** {one-sentence rationale — what problem this solves}
63
+
64
+ **Acceptance Criteria:**
65
+ - {observable user-facing behavior 1}
66
+ - {observable user-facing behavior 2}
63
67
 
64
- ## Task 3 — {title}
65
- **Wave:** 2 (after Task 1, 2)
66
- **Files:** {files}
67
- **Action:** {what to build}
68
- **Done when:** {criterion}
68
+ **Action:** {concrete steps with function names, imports, patterns}
69
+
70
+ **Validation:** (builder self-check)
71
+ - `{exact command}` → expected output
72
+
73
+ **Context:** Read @{file references}
69
74
 
70
75
  ## Success Criteria
71
- - [ ] {truth 1 — what the user can do}
72
- - [ ] {truth 2}
73
- - [ ] {truth 3}
76
+ - [ ] {phase-level truth 1}
77
+ - [ ] {phase-level truth 2}
74
78
  ```
75
79
 
76
80
  ## Task Specificity (Mandatory)
77
81
 
78
- Every task MUST have these three fields with concrete content:
82
+ Every task MUST have these fields with concrete content:
83
+
84
+ - **Files:** Absolute paths from project root. Not "the auth files" or "relevant components". Specific: `src/app/auth/login/page.tsx`, `src/lib/auth.ts`. If creating, state what it exports. If modifying, state what changes.
85
+ - **Depends on:** Explicit task numbers this task requires, OR `none`. This is what enables wave assignment and parallel-safe execution. Do not leave it blank.
86
+ - **Why:** One sentence explaining the *motivation* — what problem this solves, what would break without it. Not "implement auth" but "Session persistence is the #1 abandonment trigger; verification emails are wasted without it."
87
+ - **Acceptance Criteria:** 2-4 bullet points describing what the user can observe when this task is done. Not "auth works" but "User signs up, receives verification email, clicks link, lands on /dashboard with session persisted across refresh."
88
+ - **Action:** At least one concrete instruction. Reference specific functions, components, patterns: "Add `signInWithPassword()` call in the `handleSubmit` handler, validate email with Zod schema, redirect to `/dashboard` on success."
89
+ - **Validation:** 1-3 grep/curl/tsc commands the builder runs BEFORE committing. These are the builder's self-check — they prove the task actually produced running code, not just files.
79
90
 
80
- - **Files:** Absolute paths from project root. Not "the auth files" or "relevant components". Specific: `src/app/auth/login/page.tsx`, `src/lib/auth.ts`. If creating a file, state what it exports. If modifying, state what changes.
81
- - **Action:** At least one concrete instruction — not just "implement auth". Reference specific functions, components, or patterns. "Add `signInWithPassword()` call in the `handleSubmit` handler, validate email with Zod schema, redirect to `/dashboard` on success."
82
- - **Done when:** Testable, not fuzzy. Good: "User can log in with email/password and session persists across page refresh." Bad: "Auth works." Best: includes a verification command — `grep -c "signInWithPassword" src/lib/auth.ts` returns non-zero.
91
+ **Persona (optional):** If a task has a clear specialist lens (security, architect, ux, frontend, backend, performance), set `**Persona:**` so the builder weights relevant rules. Leave blank or set `none` if generic.
83
92
 
84
93
  If a task involves a library, framework, or API you're unsure about, fetch the current documentation BEFORE specifying the approach. Don't guess at APIs.
85
94
 
@@ -89,7 +98,7 @@ Preferred order:
89
98
 
90
99
  Your training data is often stale. A two-second lookup is cheaper than a wrong task specification.
91
100
 
92
- **Self-check:** Before returning the plan, verify every task has specific file paths, concrete actions, and testable done-when criteria. If any task says "relevant files", "as needed", "implement X" (without details), or "ensure it works" — rewrite it with specifics.
101
+ **Self-check:** Before returning the plan, verify every task has: specific file paths, an explicit Depends on line, a one-sentence Why, 2-4 Acceptance Criteria, concrete Action, and 1-3 Validation commands. If any field says "relevant files", "as needed", "implement X" (without details), or "ensure it works" — rewrite it with specifics. If you can't write a Why, the task is probably not needed.
93
102
 
94
103
  ## Verification Contracts
95
104
 
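
Reviewer's note (not part of the package): the planner's self-check on story-file fields could be approximated with a simple presence test, sketched below. The `missingFields` helper is hypothetical. It only checks that each bold field label appears in a task block, taking the field list from the planner's "Every task block MUST include" sentence. It does not judge whether the content is vague.

```javascript
// Hypothetical sketch: reports which story-file fields a task block lacks.
// Field names follow the planner.md list; Persona is optional and not checked.
const REQUIRED_FIELDS = [
  "Wave",
  "Files",
  "Depends on",
  "Why",
  "Acceptance Criteria",
  "Action",
  "Validation",
  "Context",
];

function missingFields(taskBlock) {
  // A field counts as present when its bold label, e.g. "**Why:**", appears.
  return REQUIRED_FIELDS.filter((field) => !taskBlock.includes(`**${field}:**`));
}
```

A caller would split the plan on `## Task` headings and run `missingFields` on each block. Vagueness checks such as rejecting `Files: relevant files` would still need a reviewer or a separate heuristic.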
package/agents/research-synthesizer.md CHANGED
@@ -44,17 +44,30 @@ Write for someone who will only read this section.
44
44
 
45
45
  Don't duplicate full documents. Summarize the 3-5 most important items from each dimension. Link back to the detail docs for readers who want more.
46
46
 
47
- ### 4. Derive Roadmap Implications
47
+ ### 4. Derive Journey Implications (Multi-Milestone)
48
48
 
49
- This is the most important section. Based on:
50
- - FEATURES.md MVP definition → what v1 must have
51
- - ARCHITECTURE.md build order → what depends on what
52
- - PITFALLS.md phase mapping → what each phase must prevent
49
+ This is the most important section. Suggest the **full milestone arc**, not just a v1 phase list.
53
50
 
54
- Suggest a phase structure. Be explicit about:
55
- - **What each phase delivers** (user-facing capability)
56
- - **Why this order** (dependencies or risk-first reasoning)
57
- - **Research flags** — phases likely needing deeper research during `/qualia-plan`
51
+ Based on:
52
+ - FEATURES.md split (table stakes = v1 across milestones, differentiators = later milestones or post-handoff)
53
+ - ARCHITECTURE.md build order — what depends on what, which foundation must land in Milestone 1 to support final-milestone requirements
54
+ - PITFALLS.md — which risks stall later milestones and need to be addressed in Milestone 1 foundations
55
+
56
+ Suggest a **2-5 milestone arc ending in Handoff**:
57
+
58
+ - **Milestone 1 · Foundation** — almost always. DB, auth, base layout, deploy pipeline.
59
+ - **Milestone 2-{N-1} · Core + Expansion** — the value-delivering capabilities, ordered by dependency.
60
+ - **Milestone {N} · Handoff** — ALWAYS the final milestone. Fixed 4 phases: Polish, Content + SEO, Final QA, Handoff.
61
+
62
+ For each milestone, say:
63
+ - **Name** — short, evocative
64
+ - **Why now** — one plain-language sentence explaining why this follows the previous
65
+ - **Exit criteria** — 2-3 observable outcomes
66
+ - **Phases sketched** — 2-5 phase names with one-line goals (M1 full detail, M2..M{N-1} sketched)
67
+
68
+ Also suggest:
69
+ - **Research flags** — which milestones likely need deeper research during `/qualia-plan` (the roadmapper may schedule `/qualia-research {N}` for these)
70
+ - **Handoff implications** — what the client needs to take over (credentials, docs, training) — informs the Handoff milestone's scope
58
71
 
59
72
  ### 5. Set Overall Confidence
60
73
 
@@ -79,8 +92,8 @@ Note gaps: areas where research was inconclusive. These will be addressed during
79
92
  ```
80
93
  Wrote: .planning/research/SUMMARY.md
81
94
  Overall confidence: {HIGH/MEDIUM/LOW}
82
- Suggested phases: {count}
83
- Research flags: {count} (phases needing deeper research during planning)
95
+ Suggested milestones: {count including Handoff}
96
+ Research flags: {count} (milestones needing deeper research during planning)
84
97
  ```
85
98
 
86
- The roadmapper agent reads your SUMMARY.md as context when producing REQUIREMENTS.md and ROADMAP.md.
99
+ The roadmapper agent reads your SUMMARY.md as context when producing JOURNEY.md, REQUIREMENTS.md, and ROADMAP.md (Milestone 1 detail).