qualia-framework 6.14.0 → 6.22.0

Files changed (50)
  1. package/AGENTS.md +8 -5
  2. package/CHANGELOG.md +130 -0
  3. package/CLAUDE.md +3 -1
  4. package/agents/roadmapper.md +16 -14
  5. package/bin/agent-status.js +24 -11
  6. package/bin/branch-hygiene.js +135 -0
  7. package/bin/command-surface.js +1 -0
  8. package/bin/compile-instructions.js +82 -0
  9. package/bin/eval-runner.js +218 -0
  10. package/bin/host-adapters.js +72 -12
  11. package/bin/install.js +21 -13
  12. package/bin/last-report.js +207 -0
  13. package/bin/project-sync.js +315 -0
  14. package/bin/runtime-manifest.js +6 -0
  15. package/bin/state.js +112 -1
  16. package/bin/verify-panel.js +294 -0
  17. package/bin/wave-plan.js +211 -0
  18. package/docs/erp-contract.md +145 -0
  19. package/package.json +3 -2
  20. package/rules/codex-goal.md +28 -26
  21. package/rules/infrastructure.md +1 -1
  22. package/skills/qualia/SKILL.md +6 -0
  23. package/skills/qualia-build/SKILL.md +12 -9
  24. package/skills/qualia-eval/SKILL.md +83 -0
  25. package/skills/qualia-feature/SKILL.md +20 -4
  26. package/skills/qualia-fix/SKILL.md +13 -1
  27. package/skills/qualia-milestone/SKILL.md +12 -6
  28. package/skills/qualia-new/REFERENCE.md +6 -4
  29. package/skills/qualia-new/SKILL.md +27 -15
  30. package/skills/qualia-plan/SKILL.md +2 -2
  31. package/skills/qualia-report/SKILL.md +10 -0
  32. package/skills/qualia-scope/SKILL.md +3 -3
  33. package/skills/qualia-ship/SKILL.md +34 -4
  34. package/skills/qualia-update/SKILL.md +4 -0
  35. package/skills/qualia-verify/SKILL.md +45 -24
  36. package/templates/instructions.md +32 -0
  37. package/templates/journey.md +1 -1
  38. package/templates/project-discovery.md +30 -23
  39. package/templates/requirements.md +7 -7
  40. package/tests/agent-status.test.sh +15 -0
  41. package/tests/branch-hygiene.test.sh +93 -0
  42. package/tests/eval-runner.test.sh +147 -0
  43. package/tests/instructions.test.sh +109 -0
  44. package/tests/last-report.test.sh +156 -0
  45. package/tests/lib.test.sh +2 -2
  46. package/tests/project-sync.test.sh +175 -0
  47. package/tests/run-all.sh +7 -0
  48. package/tests/state.test.sh +92 -0
  49. package/tests/verify-panel.test.sh +162 -0
  50. package/tests/wave-plan.test.sh +153 -0
@@ -1,46 +1,48 @@
1
- # Codex /goal integration
1
+ # Work-unit goal (both runtimes)
2
2
 
3
- When this skill spawns a unit of work on **Codex** (not Claude Code), set the thread goal at the start so Codex's native token-budget + status tracking takes over.
3
+ When a skill begins a defined **unit of work** (a phase build, a feature, a milestone, a fix), set an explicit goal (an objective + a token budget) so the session tracks burn-vs-budget and stays anchored to one outcome. Both runtimes get this; the *surface* differs.
4
4
 
5
- ## Runtime detection
6
-
7
- You are on Codex when `~/.codex/` exists and `~/.claude/` is absent or stale. The simplest probe:
5
+ The objective + budget come from one shared helper, regardless of runtime:
8
6
 
9
7
  ```bash
10
- test -f ~/.codex/AGENTS.md && echo codex || echo claude
8
+ node ${QUALIA_BIN}/codex-goal.js {scope} # scope: one of phase · task · feature · quick
11
9
  ```
12
10
 
13
- If the answer is `claude`, **skip this entire rule** — Claude Code has no equivalent surface and emitting `/goal` text would just be noise.
11
+ It prints two lines from `.planning/STATE.md` + `ROADMAP.md`:
12
+
13
+ ```
14
+ /goal {objective text}
15
+ # token_budget suggestion: {N}
16
+ ```
14
17
 
15
- ## How to set the goal
18
+ ## Runtime detection
19
+
20
+ ```bash
21
+ test -f ~/.codex/AGENTS.md && [ ! -d ~/.claude ] && echo codex || echo claude
22
+ ```
16
23
 
17
- 1. Run the helper to produce the objective string + suggested token budget:
24
+ ## Codex: native `/goal`
18
25
 
19
- ```bash
20
- node ~/.codex/bin/codex-goal.js {scope}
21
- ```
26
+ Codex has a first-class goal surface (`thread_goals`: objective, token_budget, tokens_used, status).
22
27
 
23
- `{scope}` is one of: `phase` · `task` · `feature` · `quick`. Use the scope of the current skill.
28
+ 1. **If the `update_goal` tool is available** (Codex exposes it as a model-callable tool), call it with `objective` = the text after `/goal ` and `token_budget` = the integer suggestion.
29
+ 2. **Otherwise** surface the `/goal` line for the user to paste. Don't silently skip — it's a one-second set and the only way Codex's budget telemetry knows what to track.
24
30
 
25
- 2. The output is two lines:
31
+ ## Claude Code: equivalent via the harness work-list + budget
26
32
 
27
- ```
28
- /goal {objective text from STATE.md + ROADMAP.md}
29
- # token_budget suggestion: {N}
30
- ```
33
+ Claude Code has no `/goal` table, but it has a native equivalent: the **session task-list** (the model's todo/task tool) and the turn **token budget**. Use them so the work unit is just as anchored and visible:
31
34
 
32
- 3. **If the `update_goal` tool is available** to you (Codex exposes it as a model-callable tool), call it directly with:
33
- - `objective` = the text after `/goal ` on line 1
34
- - `token_budget` = the integer suggestion on line 2
35
+ 1. **Create a tracked task** for the unit with the objective as its title (e.g. *"Phase 3 checkout + Stripe webhook"*). Mark it `in_progress` at start, `completed` at end. This is the Claude-side "active goal" — it shows in the UI and survives compaction.
36
+ 2. **Treat `token_budget` as the unit's context budget.** State it in the opening line (banner) — *"Goal: {objective} · budget ~{N} tok"* — so the operator and the model both see how much room the unit has. If a `+Nk` turn directive is set, prefer that.
37
+ 3. For a multi-wave phase, the per-task `.agent-status/` entries (see `/qualia-build`) are the sub-goals under this one.
35
38
 
36
- 4. **If `update_goal` is not available**, surface the `/goal` line to the user in your next message and let them paste it. Do not silently skip — the goal-set takes 1 second and is the only way Codex's budget telemetry knows what to track.
39
+ Either way the rule is the same: **one named objective + one budget per work unit, surfaced, not silent.**
37
40
 
38
41
  ## When NOT to set a goal
39
42
 
40
- - The user is on Claude Code (no `/goal` surface).
41
- - A goal is already active for this thread (Codex rejects `update_goal` when one exists; call `thread/goal/get` first if you're using the tool API directly).
42
- - The work is open-ended exploration with no clear objective (e.g. `/qualia`, `/qualia-scope`). Goals are for executing a defined scope.
43
+ - A goal/task is already active for this unit (don't double-set; Codex rejects `update_goal` when one exists — check first).
44
+ - Open-ended exploration with no defined scope (`/qualia`, `/qualia-scope` PROJECT MODE, `/qualia-idk`). Goals are for *executing* a defined scope, not discovering one.
43
45
 
44
46
  ## Why
45
47
 
46
- Codex's `thread_goals` table tracks `objective`, `token_budget`, `tokens_used`, and a `status` enum (`active | paused | blocked | usage_limited | budget_limited | complete`). Setting the goal lets the user see burn-vs-budget in the TUI without the framework reinventing it. The token-budget number also makes the model self-aware of how much context it has left for the current unit of work.
48
+ A named objective + budget keeps a unit of work from sprawling: the model stays self-aware of how much context remains, the operator sees burn-vs-budget, and the unit has a single definition of done. On Codex this rides `thread_goals`; on Claude Code it rides the task-list + turn budget. Same discipline, native surface on each.
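The rule above says the helper prints a `/goal` line and a `# token_budget suggestion` line. A minimal sketch of consuming that output, assuming exactly that two-line shape: `parseGoalOutput` and `goalBanner` are hypothetical names, not part of the package, and the banner fallback without a budget is an assumption.

```javascript
// Hypothetical helpers for illustration; not part of qualia-framework.
// Parses the two lines the rule says codex-goal.js prints.
function parseGoalOutput(stdout) {
  const lines = stdout.split(/\r?\n/).map((l) => l.trim()).filter(Boolean);
  const goalLine = lines.find((l) => l.startsWith('/goal '));
  if (!goalLine) throw new Error('no /goal line in helper output');
  const budgetLine = lines.find((l) => /^#\s*token_budget suggestion:/.test(l));
  const match = budgetLine ? budgetLine.match(/(\d+)\s*$/) : null;
  return {
    objective: goalLine.slice('/goal '.length).trim(), // text after "/goal "
    tokenBudget: match ? Number(match[1]) : null,      // integer suggestion, if present
  };
}

// Claude Code side: the opening banner line quoted in the rule.
// Omitting the budget when none was parsed is an assumption.
function goalBanner({ objective, tokenBudget }) {
  return tokenBudget === null
    ? `Goal: ${objective}`
    : `Goal: ${objective} · budget ~${tokenBudget} tok`;
}
```

On Codex the same two fields would feed `update_goal` (`objective`, `token_budget`); on Claude Code they become the task title and the banner text.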
@@ -49,7 +49,7 @@ Standard services across all Qualia projects. Use these unless the project expli
49
49
  - **QualiasolutionsCY** — primary org for all Qualia Solutions projects
50
50
  - **SakaniQualia** — org for Sakani-related projects (real estate platform)
51
51
  - All repos are private by default
52
- - Branch protection: main/master require PR reviews (enforced by framework guards)
52
+ - Main integration: feature branches integrate to `main` at **`/qualia-ship`** (ship is the single merge point — it fast-forwards the branch into `main`, deploys from `main`, and deletes the branch). Pushes to `main` are **allowed and recorded** by `branch-guard` (per-employee tally → ERP) — accountability, not a hard block. `/qualia-report` sweeps for branches with unshipped commits + stale PRs at clock-out so nothing lingers. Keep GitHub branch protection on `main` OFF (or with the team allowed to push) for this model; if you re-enable required reviews, switch ship to an auto-merged PR instead.
53
53
 
54
54
  ## Vercel Teams (admin knowledge)
55
55
  - Qualia operates across **3 Vercel teams** — projects are distributed across them
@@ -33,6 +33,12 @@ ls .planning/phase-*-plan.md 2>/dev/null || echo "NO_PLANS"
33
33
  ls .planning/phase-*-verification.md 2>/dev/null || echo "NO_VERIFICATIONS"
34
34
  ```
35
35
 
36
+ And surface where work was left off last time — the richest "where we left off" signal lives in `.planning/reports/`:
37
+ ```bash
38
+ node ${QUALIA_BIN}/last-report.js 2>/dev/null
39
+ ```
40
+ Exit 0 → it prints a one-line digest of the newest session report (`Last session ({date}, {age}d ago): {summary} → next: {next}`). Exit 1 → no reports yet (nothing to surface). When a project is loaded and a digest exists, print that line **at the very TOP of your output**, before the banner — so the first thing the operator (or a teammate picking the project up) sees is exactly where the last session ended.
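For illustration, the digest format quoted above could be produced roughly like this. This is a sketch, not the actual `last-report.js`: the `report` object shape (`date`, `summary`, `next`) and the whole-day rounding are assumptions.

```javascript
// Hypothetical sketch of the one-line digest; the report shape is assumed.
const MS_PER_DAY = 86400000;

function digestLine(report, now = new Date()) {
  // Age in whole days between the report date and "now".
  const ageDays = Math.floor((now - new Date(report.date)) / MS_PER_DAY);
  return `Last session (${report.date}, ${ageDays}d ago): ${report.summary} → next: ${report.next}`;
}
```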
41
+
36
42
  Read conversation context — what has the user been doing, what errors occurred.
37
43
 
38
44
  ### 2. Classify and Route
@@ -21,12 +21,13 @@ Execute phase plan. Each task = fresh subagent. Independent tasks run parallel.
21
21
  `/qualia-build` — build current planned phase
22
22
  `/qualia-build {N}` — build specific phase
23
23
  `/qualia-build {N} --auto` — build + chain into `/qualia-verify {N} --auto` (no human gate)
24
+ `/qualia-build {N} --parallel K` — cap concurrent builders at K (default auto: sequential under 3 tasks, else up to 5)
24
25
 
25
26
  ## Process
26
27
 
27
- ### 0. Codex goal (Codex runtime only)
28
+ ### 0. Set the work-unit goal
28
29
 
29
- Per `rules/codex-goal.md` — set the thread goal at phase start with scope `phase`.
30
+ Per `rules/codex-goal.md` — set the work-unit goal at phase start with scope `phase` (Codex `/goal`; on Claude Code, a tracked task + budget in the banner). One named objective + budget for the whole build.
30
31
 
31
32
  ### 1. Load Plan
32
33
 
@@ -76,13 +77,15 @@ git diff --stat
76
77
  node ${QUALIA_BIN}/qualia-ui.js banner build {N} "{phase name}"
77
78
  ```
78
79
 
79
- **For each wave (sequential):**
80
+ **Derive the build schedule from the dependency graph (don't trust hand-numbered waves, don't over-spawn):**
80
81
 
81
82
  ```bash
82
- node ${QUALIA_BIN}/qualia-ui.js wave {W} {total_waves} {tasks_in_wave}
83
+ node ${QUALIA_BIN}/wave-plan.js .planning/phase-{N}-contract.json {--parallel K if set} --json
83
84
  ```
84
85
 
85
- **Per task in wave: spawn ALL as separate `Agent()` calls in SAME turn (concurrent). Do NOT await one before spawning next.**
86
+ `wave-plan.js` recomputes minimal-depth waves from `depends_on` (maximal safe parallelism) and splits each into **batches capped at `max_concurrency`** (auto: 1 if <3 tasks, else 5; `--parallel K` overrides). Spawn **one batch at a time, in order**: every task in a batch is dependency-free of its batch-mates, so they run concurrently; the next batch waits for the fan-in barrier (see "After each batch"). Follow the emitted `batches[]`, not the raw contract `wave` numbers.
87
+
88
+ **Per batch: spawn ALL its tasks as separate `Agent()` calls in the SAME turn (concurrent). Do NOT await one before spawning the next.**
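The scheduling idea described here (minimal-depth waves from `depends_on`, then batches capped at a concurrency limit) can be sketched as follows. This is an illustration under assumptions, not the real `wave-plan.js`: the task shape `{ id, depends_on }`, the function name `planBatches`, and the output fields are hypothetical.

```javascript
// Sketch only. Assumes tasks look like { id, depends_on: [ids] }.
function planBatches(tasks, parallel) {
  // Auto rule from the text: sequential under 3 tasks, else up to 5.
  const cap = parallel ?? (tasks.length < 3 ? 1 : 5);
  if (!Number.isInteger(cap) || cap < 1) throw new Error('cap must be a positive integer');

  const byId = new Map(tasks.map((t) => [t.id, t]));
  const depth = new Map();

  // Depth = 0 for tasks with no dependencies, else 1 + deepest dependency.
  function depthOf(id, trail = new Set()) {
    if (depth.has(id)) return depth.get(id);
    if (trail.has(id)) throw new Error(`dependency cycle at ${id}`);
    const task = byId.get(id);
    if (!task) throw new Error(`unknown task ${id}`);
    trail.add(id);
    const deps = task.depends_on ?? [];
    const d = deps.length === 0 ? 0 : 1 + Math.max(...deps.map((dep) => depthOf(dep, trail)));
    trail.delete(id);
    depth.set(id, d);
    return d;
  }

  // Group tasks by depth: tasks at the same depth cannot depend on each other.
  const waves = [];
  for (const t of tasks) {
    const d = depthOf(t.id);
    (waves[d] ??= []).push(t.id);
  }

  // Split every wave into batches of at most `cap` tasks, in wave order.
  const batches = [];
  waves.forEach((ids, w) => {
    for (let i = 0; i < ids.length; i += cap) {
      batches.push({ wave: w + 1, tasks: ids.slice(i, i + cap) });
    }
  });
  return batches;
}
```

Because a task's depth is always greater than the depth of anything it depends on, two tasks in the same batch never depend on each other, which is what makes spawning a whole batch in one turn safe.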
86
89
 
87
90
  ```bash
88
91
  node ${QUALIA_BIN}/qualia-ui.js task {task_num} "{task title}"
@@ -150,15 +153,15 @@ Execute. Commit. Write your DONE/BLOCKED/PARTIAL status. Return DONE/BLOCKED/PAR
150
153
  node ${QUALIA_BIN}/qualia-ui.js done {task_num} "{title}" {commit_hash}
151
154
  ```
152
155
 
153
- **After each wave — fan-in barrier (deterministic, not "did the model notice"):**
156
+ **After each batch — fan-in barrier (deterministic, not "did the model notice"):**
154
157
 
155
158
  ```bash
156
- node ${QUALIA_BIN}/agent-status.js barrier .planning/phase-{N}-contract.json --wave {W}
159
+ node ${QUALIA_BIN}/agent-status.js barrier --tasks {comma-separated task ids in this batch}
157
160
  ```
158
161
 
159
- Exit 0 ⇔ every task in the wave wrote `DONE`. Non-zero → the barrier lists which tasks are RUNNING/BLOCKED/PARTIAL/MISSING. Do NOT advance to the next wave until the barrier passes; a BLOCKED/PARTIAL task is a wave failure (§4). `agent-status.js list` shows the live wave view.
162
+ Exit 0 ⇔ every task in the batch wrote `DONE`. Non-zero → the barrier lists which tasks are RUNNING/BLOCKED/PARTIAL/MISSING. Do NOT spawn the next batch until the barrier passes; a BLOCKED/PARTIAL task is a wave failure (§4). `agent-status.js list` shows the live view. (Gating per batch — not per contract wave — keeps the barrier aligned with the `wave-plan.js` schedule, whose derived waves needn't match the contract's declared wave numbers.)
160
163
 
161
- **After each wave:** move to next, show summary.
164
+ **After each batch:** move to the next batch in the schedule, show summary.
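The barrier rule (exit 0 only when every task in the batch wrote `DONE`) reduces to a small check. The sketch below is not `agent-status.js`: the `statuses` map stands in for whatever the `.agent-status/` entries contain, and using exit code 1 for "non-zero" is an assumption.

```javascript
// Hypothetical sketch of the fan-in barrier; inputs are assumed shapes.
function barrier(taskIds, statuses) {
  // A task with no recorded status is treated as MISSING.
  const pending = taskIds
    .map((id) => ({ id, status: statuses[id] ?? 'MISSING' }))
    .filter((t) => t.status !== 'DONE');
  return { exitCode: pending.length === 0 ? 0 : 1, pending };
}
```

Usage: `barrier(['2', '3'], { 2: 'DONE', 3: 'BLOCKED' })` reports task 3 as pending, so the next batch would not be spawned.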
162
165
 
163
166
  ### 3. Wave Completion
164
167
 
@@ -0,0 +1,83 @@
1
+ ---
2
+ name: qualia-eval
3
+ description: "Evaluate an AI feature (chat / RAG / voice / agent) against a layered eval suite — deterministic assertions first, then llm-rubric judges — and gate on the result. Qualia gates UI and code; this is the equivalent gate for the AI artifacts a project builds. Triggers: 'eval this agent', 'test the chatbot', 'evaluate the AI feature', 'rag eval', 'does the assistant answer correctly', 'judge the model output', 'qualia-eval'."
4
+ allowed-tools:
5
+ - Bash
6
+ - Read
7
+ - Write
8
+ - Edit
9
+ - Grep
10
+ - Glob
11
+ - Agent
12
+ ---
13
+
14
+ # /qualia-eval — Evaluate an AI Feature
15
+
16
+ `contract-runner` proves the code exists; `verify-panel` proves the code is correct. Neither can tell you whether the **chatbot actually answers the refund question**. This lane closes that gap with a layered eval suite — cheap deterministic checks first, model judgment only where a model is required — mirroring the contract-runner evidence model.
17
+
18
+ ## Usage
19
+ `/qualia-eval {suite.json}` — run an eval suite for one AI feature
20
+ `/qualia-eval {N}` — run every `.planning/evals/*-suite.json` for phase N (verify-step gate)
21
+
22
+ ## The suite (JSON)
23
+
24
+ One suite per AI feature. Each case carries a captured `output` (or `output_file`) plus optional `latency_ms` / `cost_usd`, and a list of assertions:
25
+
26
+ ```json
27
+ {
28
+ "feature": "support-chat",
29
+ "cases": [
30
+ { "name": "refund window", "input": "what's your refund policy?",
31
+ "output": "We refund within 30 days of purchase.",
32
+ "latency_ms": 1200, "cost_usd": 0.008,
33
+ "assert": [
34
+ { "type": "contains", "value": "30 days" },
35
+ { "type": "not_contains", "value": "I cannot help" },
36
+ { "type": "max_latency_ms", "value": 2000 },
37
+ { "type": "llm_rubric", "rubric": "answer is grounded in the policy, no hallucinated terms" }
38
+ ] } ]
39
+ }
40
+ ```
41
+
42
+ Deterministic assertion types (settled with no model): `contains`, `not_contains`, `equals`, `regex`, `not_regex`, `min_length`, `max_length`, `json_valid`, `json_path` (`equals`/`contains`), `max_latency_ms`, `max_cost_usd`. The model-only type is `llm_rubric`.
43
+
44
+ ## Process
45
+
46
+ ### 1. Capture outputs
47
+
48
+ For each case, run the AI feature on `input` and record the real `output` (+ `latency_ms`/`cost_usd` if measurable) back into the suite. Use the project's own entrypoint — an API route, a script, or the agent SDK. If outputs are already captured (replay fixtures), skip to step 2.
49
+
50
+ ### 2. Judge the rubrics (one judge per llm_rubric, fresh context)
51
+
52
+ Deterministic assertions need no model — `eval-runner.js` settles them. For each `llm_rubric` assertion, spawn a judge to return a verdict, then write `"verdict": "pass"|"fail"` onto that assertion in the suite. This mirrors how `verify-panel` consumes skeptic votes: the model judges, the runner aggregates.
53
+
54
+ ```
55
+ Agent(prompt="
56
+ Role: @${QUALIA_AGENTS}/verifier.md
57
+
58
+ JUDGE one rubric against one output. No code to grep — judge the text only.
59
+ Rubric: {rubric}
60
+ Input: {input}
61
+ Output to judge: {output}
62
+
63
+ Return exactly one line: PASS — {reason} OR FAIL — {reason}. Default FAIL if the output does not clearly satisfy the rubric.
64
+ ", subagent_type="qualia-verifier", description="Judge rubric — {case name}")
65
+ ```
66
+
67
+ An `llm_rubric` with no verdict is PENDING and **fails** the suite — never silently pass an unjudged rubric.
68
+
69
+ ### 3. Run the deterministic verdict
70
+
71
+ ```bash
72
+ node ${QUALIA_BIN}/eval-runner.js {suite.json} --write
73
+ ```
74
+
75
+ `eval-runner.js` runs every deterministic assertion itself, folds in the rubric verdicts, and exits **0 = all cases pass / 1 = any failure or unjudged rubric**. Artifact: `.planning/evals/eval-{feature}.json`.
76
+
77
+ ### 4. Gate
78
+
79
+ Exit 0 → the AI feature meets its bar; report PASS with the per-case summary. Exit 1 → list the failing cases + assertions and route to `/qualia-fix` (behavior wrong) or back to the prompt/RAG config. When run as a phase verify-step gate (`/qualia-eval {N}`), a FAIL is a phase FAIL — same standing as a failing contract.
80
+
81
+ ```bash
82
+ node ${QUALIA_BIN}/qualia-ui.js end "EVAL COMPLETE" "/qualia-verify {N}"
83
+ ```
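To make the layered verdict concrete, here is a rough sketch of how deterministic assertions and rubric verdicts could be folded into one exit code. It is not `eval-runner.js`: it covers only a subset of the listed assertion types, and failing closed on unknown types or missing measurements is an assumption.

```javascript
// Illustrative only; function names and fail-closed defaults are assumptions.
function checkAssertion(kase, a) {
  const out = kase.output ?? '';
  switch (a.type) {
    case 'contains': return out.includes(a.value);
    case 'not_contains': return !out.includes(a.value);
    case 'equals': return out === a.value;
    case 'regex': return new RegExp(a.value).test(out);
    case 'min_length': return out.length >= a.value;
    case 'max_length': return out.length <= a.value;
    case 'max_latency_ms':
      return typeof kase.latency_ms === 'number' && kase.latency_ms <= a.value;
    case 'max_cost_usd':
      return typeof kase.cost_usd === 'number' && kase.cost_usd <= a.value;
    case 'llm_rubric':
      return a.verdict === 'pass'; // no verdict means PENDING, which fails
    default:
      return false; // unknown type: fail closed
  }
}

function runSuite(suite) {
  const cases = suite.cases.map((kase) => {
    const failed = kase.assert.filter((a) => !checkAssertion(kase, a)).map((a) => a.type);
    return { name: kase.name, pass: failed.length === 0, failed };
  });
  // 0 = all cases pass, 1 = any failure or unjudged rubric.
  return { feature: suite.feature, exitCode: cases.every((c) => c.pass) ? 0 : 1, cases };
}
```

Run against the `support-chat` suite shown earlier, the deterministic checks pass but the suite still exits 1 until the `llm_rubric` assertion carries `"verdict": "pass"`.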
@@ -40,9 +40,9 @@ One command for adding a small new capability outside the planned Road. Auto-det
40
40
 
41
41
  ## Process
42
42
 
43
- ### 0. Codex goal (Codex runtime only)
43
+ ### 0. Set the work-unit goal
44
44
 
45
- Per `rules/codex-goal.md` — set the thread goal with scope matching the auto-detected bucket (`quick` for inline, `feature` for spawn). Do this AFTER Step 2 (auto-detect scope) so the budget matches the actual work shape.
45
+ Per `rules/codex-goal.md` — set the work-unit goal (Codex `/goal`; on Claude Code, a tracked task + budget) with scope matching the auto-detected bucket (`quick` for inline, `feature` for spawn). Do this AFTER Step 2 (auto-detect scope) so the budget matches the actual work shape.
46
46
 
47
47
  ### 1. Capture description
48
48
 
@@ -50,6 +50,22 @@ If invoked without args, ask: **"What do you want to build?"**
50
50
 
51
51
  Wait for free-text answer. Don't paraphrase back. Capture the user's exact phrasing — it feeds both the auto-scope classifier and the eventual commit message.
52
52
 
53
+ ### 1b. Scope gate (anti-drift — keep work on the milestone arc)
54
+
55
+ Before building, check whether this work belongs to the active milestone. This is what stops feature/fix from drifting off-plan.
56
+
57
+ ```bash
58
+ node ${QUALIA_BIN}/state.js check 2>/dev/null # → milestone, profile; JOURNEY.md = the arc
59
+ node ${QUALIA_BIN}/state.js reqs-check 2>/dev/null # current milestone's open REQ-IDs
60
+ ```
61
+
62
+ - **No active project / no milestone** (`.planning/` absent) → not governed; proceed normally (skip to Step 2).
63
+ - **Active milestone**: decide if this work serves it.
64
+ - **In-scope** (it advances the current milestone's goal or an open REQ-ID) → proceed. Record it tagged to scope in Steps 4/5: add `--scope in --ref {REQ-ID or phase}` to the `state.js transition --to note` call.
65
+ - **Off-road** (a new capability/feature that isn't in the current milestone): this is exactly the drift the framework guards against. Resolve by profile (`state.js check` → `profile`):
66
+ - **strict** → STOP. Do not build off-road. Route to `/qualia-scope` to fold it into the arc (a phase/REQ in the current or a future milestone) or `/qualia-milestone` if it's a new milestone. Off-road building is blocked.
67
+ - **standard** → allowed, but **recorded**: build it, then record with `--scope off --ref "{what + why off-road}"` so the OWNER + ERP see the off-road tally (it is never silent).
68
+
53
69
  ### 2. Auto-detect scope
54
70
 
55
71
  Classify the description into one of three buckets:
@@ -116,7 +132,7 @@ git commit -m "fix: {description}"
116
132
  5. Record in state:
117
133
 
118
134
  ```bash
119
- node ${QUALIA_BIN}/state.js transition --to note --notes "{brief description}" --tasks-done 1
135
+ node ${QUALIA_BIN}/state.js transition --to note --notes "{brief description}" --tasks-done 1 {--scope in --ref {REQ/phase} | --scope off --ref "{why off-road}" — from the §1b scope gate}
120
136
  ```
121
137
 
122
138
  6. End with:
@@ -184,7 +200,7 @@ node ${QUALIA_BIN}/qualia-ui.js end "FEATURE SHIPPED (spawn)"
184
200
  5. Record in state:
185
201
 
186
202
  ```bash
187
- node ${QUALIA_BIN}/state.js transition --to note --notes "{description}" --tasks-done 1
203
+ node ${QUALIA_BIN}/state.js transition --to note --notes "{description}" --tasks-done 1 {--scope in --ref {REQ/phase} | --scope off --ref "{why off-road}" — from the §1b scope gate}
188
204
  ```
189
205
 
190
206
  ### 6. Execute the refuse path
@@ -48,6 +48,10 @@ Fix is the practical lane for "this used to work, or should work, and now it doe
48
48
  node ${QUALIA_BIN}/qualia-ui.js banner fix
49
49
  ```
50
50
 
51
+ ### 0. Set the work-unit goal
52
+
53
+ Per `rules/codex-goal.md` — set the work-unit goal (Codex `/goal`; on Claude Code, a tracked task + budget) with scope `quick` for `--quick`, else `feature`. Anchors the fix to one objective + budget so root-cause work doesn't sprawl.
54
+
51
55
  ### 1. Classify The Request
52
56
 
53
57
  Parse `$ARGUMENTS` into:
@@ -70,6 +74,14 @@ If the request is phase-sized, stop and route:
70
74
  node ${QUALIA_BIN}/qualia-ui.js end "ROUTED" "/qualia-plan"
71
75
  ```
72
76
 
77
+ ### 1b. Scope tag (anti-drift)
78
+
79
+ ```bash
80
+ node ${QUALIA_BIN}/state.js check 2>/dev/null # milestone + profile
81
+ ```
82
+
83
+ Repairing broken behavior in what the current milestone already built is **in-scope** — proceed, and tag the record `--scope in --ref {REQ/phase}` in Step 7. But a "fix" that is really **new off-road behavior** (a capability the milestone never included, dressed as a bug) is drift: in **strict** profile, STOP and route to `/qualia-scope` to fold it into the arc; in **standard**, proceed but record `--scope off --ref "{why off-road}"` so it's counted, never silent. No active milestone → not governed, proceed.
84
+
73
85
  ### 2. Build The Feedback Loop
74
86
 
75
87
  Use the cheapest check that can prove the bug is real and later prove it is fixed.
@@ -175,7 +187,7 @@ git commit -m "fix: {short symptom/root-cause summary}"
175
187
  Record state:
176
188
 
177
189
  ```bash
178
- node ${QUALIA_BIN}/state.js transition --to note --notes "{short fix summary}" --tasks-done 1
190
+ node ${QUALIA_BIN}/state.js transition --to note --notes "{short fix summary}" --tasks-done 1 {--scope in --ref {REQ/phase} | --scope off --ref "{why off-road}" — from the §1b scope tag}
179
191
  ```
180
192
 
181
193
  ### 8. Output
@@ -30,13 +30,17 @@ Triggered after `/qualia-verify` passes on the LAST phase of the current milesto
30
30
 
31
31
  ```bash
32
32
  node ${QUALIA_BIN}/state.js check
33
+ node ${QUALIA_BIN}/state.js reqs-check # this milestone's REQ-ID completion
33
34
  ```
34
35
 
35
- `state.js close-milestone` enforces two guards:
36
+ `state.js close-milestone` enforces three guards:
36
37
  - `MILESTONE_NOT_READY` — any phase not verified
37
38
  - `MILESTONE_TOO_SMALL` — milestone has < 2 phases
39
+ - `MILESTONE_REQS_INCOMPLETE` — a REQ-ID mapped to this milestone in REQUIREMENTS.md is not yet `Complete` (strict profile blocks; standard profile proceeds but the unfinished REQs are surfaced as `warnings` to log). This is what stops "finishing a milestone with scope still open."
38
40
 
39
- If either fires (without `--force`), stop and show the error. The user must verify remaining phases first (or add `--force` for explicit bypass on a preview/demo milestone).
41
+ If any fires (without `--force`), stop and show the error. Resolve before closing: verify remaining phases, finish the open requirements, or **explicitly defer** a requirement by moving it to `Out of Scope` in REQUIREMENTS.md (a conscious deferral, not silent). `--force` bypasses all three for retroactive bookkeeping only.
42
+
43
+ Run `reqs-check` first so the user sees exactly which requirements are still open before the close attempt — Step 4 (mark Complete) should already have flipped the finished ones.
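The three guards and the strict/standard split can be sketched as one function. This is an illustration, not the real `state.js close-milestone`: the input shapes are assumptions, and modelling an explicit deferral as a `status` of `Out of Scope` simplifies what the text describes as moving the requirement within REQUIREMENTS.md.

```javascript
// Hypothetical sketch; phases = [{ verified }], reqs = [{ id, status }].
function closeMilestoneGuards({ phases, reqs, profile, force = false }) {
  const errors = [];
  const warnings = [];
  if (force) return { ok: true, errors, warnings }; // --force bypasses all three

  if (phases.some((p) => !p.verified)) errors.push('MILESTONE_NOT_READY');
  if (phases.length < 2) errors.push('MILESTONE_TOO_SMALL');

  // Explicitly deferred requirements no longer count as open.
  const open = reqs.filter((r) => r.status !== 'Complete' && r.status !== 'Out of Scope');
  if (open.length > 0) {
    const msg = `MILESTONE_REQS_INCOMPLETE: ${open.map((r) => r.id).join(', ')}`;
    // strict blocks; standard proceeds and surfaces the open REQs as warnings
    if (profile === 'strict') errors.push(msg);
    else warnings.push(msg);
  }
  return { ok: errors.length === 0, errors, warnings };
}
```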
40
44
 
41
45
  ### 1b. Demo-Extension Branch
42
46
 
@@ -59,7 +63,7 @@ If `PROJECT_TYPE=demo` AND `MILESTONE_COUNT=1`, the demo's one milestone is clos
59
63
  **If "Client signed — extend to full project":**
60
64
 
61
65
  1. Update `.planning/PROJECT.md` frontmatter: `project_type: full`.
62
- 2. Run a brief discovery top-up — invoke `/qualia-scope` in PROJECT MODE, but only ask §9-§14 (the full-project-only questions). This adds the milestone arc, compliance, integrations, content ownership, handoff team, and budget shape.
66
+ 2. Run a brief discovery top-up — invoke `/qualia-scope` in PROJECT MODE, but only ask §9–§15 (the full-project-only questions). This adds the **capability inventory** (the whole project's scope), the **whole-project definition of done**, shipping order, compliance, integrations, content ownership, handoff team, and budget shape.
63
67
  3. Spawn the roadmapper in `extend-to-full` mode (see prompt below). It reads the existing single milestone (now M1), the updated discovery, and produces a full JOURNEY.md with M2..M{N-1} sketches plus the Handoff milestone.
64
68
  4. Then proceed with the standard close-milestone flow (Steps 2-9) — M1 closes, M2 opens, the user is asked to continue.
65
69
 
@@ -75,11 +79,13 @@ Read your role: @${QUALIA_AGENTS}/roadmapper.md
75
79
 
76
80
  <task>
77
81
  The existing JOURNEY.md has 1 milestone (the demo, now M1 and shipped). Extend it
78
- into a 2-5 milestone arc to Handoff:
82
+ into the FULL milestone arc to Handoff — as many milestones as the agreed scope
83
+ needs (no cap), covering the entire capability inventory:
79
84
 
80
85
  - Keep M1 exactly as-is (it shipped).
81
- - Add M2..M{N-1} based on §9 of project-discovery.md (the milestone-arc question
82
- the user answered when converting from demo).
86
+ - Add M2..M{N-1} covering every capability in §9 of project-discovery.md (the
87
+ capability inventory), ordered per §11 (shipping order). Every §9 capability
88
+ must land in a milestone — nothing agreed is left unplanned.
83
89
  - Append a Handoff milestone (fixed 4 phases: Polish, Content + SEO, Final QA,
84
90
  Handoff).
85
91
  - Update REQUIREMENTS.md to add REQ-IDs for the new milestones.
@@ -59,8 +59,10 @@ Read your role: @${QUALIA_AGENTS}/research-synthesizer.md
59
59
 
60
60
  Merge the 4 research files at .planning/research/ into .planning/research/SUMMARY.md.
61
61
  This is a multi-milestone project -- the SUMMARY must suggest a FULL milestone arc
62
- (2-5 milestones including Handoff), not just a v1 phase list. Include roadmap
63
- implications AND handoff implications (what client takeover requires).
62
+ that covers the ENTIRE capability set to its done-state (as many milestones as the
63
+ scope needs, ending in Handoff for client projects -- no milestone cap), not just a
64
+ v1 phase list. Include roadmap implications AND handoff implications (what client
65
+ takeover requires).
64
66
  ", subagent_type="qualia-research-synthesizer", description="Synthesize research")
65
67
  ```
66
68
 
@@ -74,7 +76,7 @@ Read your role: @${QUALIA_AGENTS}/roadmapper.md
74
76
 
75
77
  <task>
76
78
  Create the FULL JOURNEY for this project:
77
- - .planning/JOURNEY.md -- all milestones (2-5 including Handoff) with exit criteria
79
+ - .planning/JOURNEY.md -- all milestones (at least 2, no upper cap; ending in Handoff for client projects) covering every capability from discovery §9, with exit criteria
78
80
  - .planning/REQUIREMENTS.md -- requirements grouped by milestone
79
81
  - .planning/ROADMAP.md -- Milestone 1's phase detail (and ALL milestones if full_detail=true)
80
82
 
@@ -115,7 +117,7 @@ The branded journey ladder rendered in Step 11. Use `node ${QUALIA_BIN}/qualia-u
115
117
  ```
116
118
  ## Proposed Journey
117
119
 
118
- **{N} milestones to handoff** | **{X} requirements mapped** | All v1 requirements covered
120
+ **{N} milestones to handoff** | **{X}/{X} capabilities mapped** | Full §9 inventory covered (0 unmapped)
119
121
 
120
122
  +-- Milestone 1 . {Name} [CURRENT]
121
123
  | Why now: {one line}
@@ -58,7 +58,7 @@ Use **AskUserQuestion** (interactive UI — never a plain-text prompt):
58
58
  - question: "What kind of project is this? Pick one — it drives everything else."
59
59
  - options:
60
60
  - "Demo" — one shippable milestone, real backend, no mocks. Built to win a client conversation, extensible via `/qualia-milestone` if they sign. 8-question discovery.
61
- - "Full project" — the multi-milestone arc to Handoff. 2-5 milestones planned upfront. 14-question discovery.
61
+ - "Full project" — the full multi-milestone arc to Handoff, sized to the agreed capability set (no fixed milestone cap — the arc spans to done). 15-question discovery.
62
62
  - "Quick prototype" — landing page, throwaway, ≤1 day. Skips research and journey. (Equivalent to `--quick` flag.)
63
63
 
64
64
  Store the answer as `PROJECT_TYPE=demo` | `PROJECT_TYPE=full` | `PROJECT_TYPE=quick`. It drives every downstream step.
@@ -94,7 +94,7 @@ The shape is locked, now capture the content in one sentence:
94
94
 
95
95
  > **"What are you building? One sentence — a stranger should understand it."**
96
96
 
97
- Accept whatever the user says, even if broad. **Do NOT start an ad-hoc clarification round here.** Depth comes from the structured discovery interview in Step 4, not from free-form questioning. If the answer is "a SaaS platform" — that's fine, write it down, move on. `/qualia-scope` will refine it through its 8 or 14 structured questions.
97
+ Accept whatever the user says, even if broad. **Do NOT start an ad-hoc clarification round here.** Depth comes from the structured discovery interview in Step 4, not from free-form questioning. If the answer is "a SaaS platform" — that's fine, write it down, move on. `/qualia-scope` will refine it through its 8 or 15 structured questions.
98
98
 
99
99
  This is the ONLY free-text question in the kickoff flow. Everything else is `AskUserQuestion`.
100
100
 
@@ -102,7 +102,7 @@ This is the ONLY free-text question in the kickoff flow. Everything else is `Ask
102
102
 
103
103
  **Hard rule:** This is the next tool call after Step 3. No ad-hoc clarification, no free-form follow-up, no "let me ask a few quick things first." If the one-line pitch was "a SaaS platform", you invoke `/qualia-scope` NOW — that skill's structured questions are how breadth gets refined into depth.
104
104
 
105
- Invoke `/qualia-scope` inline in PROJECT MODE — non-technical kickoff interview. 8 questions for demo, 14 for full project. Pass `PROJECT_TYPE` so the scope skill skips the type question.
105
+ Invoke `/qualia-scope` inline in PROJECT MODE — non-technical kickoff interview. 8 questions for demo, 15 for full project. Pass `PROJECT_TYPE` so the scope skill skips the type question.
106
106
 
107
107
  This step REPLACES the old free-form "deep questioning" loop. The scope skill captures answers verbatim into `.planning/project-discovery.md`, which seeds PROJECT.md, PRODUCT.md, CONTEXT.md, and (for full projects) JOURNEY.md milestone names.
 
@@ -331,13 +331,13 @@ Read `.planning/research/FEATURES.md` and present the feature landscape. Feature
  For each category, use AskUserQuestion:
 
  - header: "{Category name}"
- - question: "Which {category} features belong to v1 (Milestones 1..N-1 excluding Handoff)?"
+ - question: "Which {category} features are part of THIS project (the full agreed scope)? Anything selected must land in a milestone — the arc is sized to fit, no cap."
  - multiSelect: true
- - options: each feature from FEATURES.md + "None for v1"
+ - options: each feature from FEATURES.md + "None of these"
 
  Track selections:
- - Selected → v1 scope (roadmapper assigns to specific milestones based on dependency order)
- - Unselected table stakes → Post-Handoff v2 (users expect these)
+ - Selected → in scope (roadmapper assigns each to a specific milestone by dependency order; the arc grows to cover all of them)
+ - Unselected only goes to Post-Handoff/Out-of-Scope if the user EXPLICITLY defers it (matches discovery §8). Don't silently drop a table-stakes feature into v2 to keep the arc short.
  - Unselected differentiators → Out of Scope
 
  Gather any additional requirements the user wants that research missed.
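As a reading aid for the hunk above, the revised "Track selections" rules can be sketched as a small classifier. This is a minimal illustration and not code from the package: the field names `kind`, `selected`, and `explicitlyDeferred`, and the `needs-confirmation` bucket for unselected table stakes that were never explicitly deferred, are assumptions made here.

```javascript
// Hypothetical sketch of the revised "Track selections" rules.
// Field names and the "needs-confirmation" bucket are illustrative assumptions.
function classifyFeature({ kind, selected, explicitlyDeferred = false }) {
  if (selected) return "in-scope"; // roadmapper must place it in a milestone
  if (kind === "differentiator") return "out-of-scope";
  // Unselected table stakes: only an explicit deferral moves it to v2.
  return explicitlyDeferred ? "post-handoff" : "needs-confirmation";
}

function trackSelections(features) {
  const result = {
    "in-scope": [],
    "post-handoff": [],
    "out-of-scope": [],
    "needs-confirmation": [],
  };
  for (const feature of features) {
    result[classifyFeature(feature)].push(feature.name);
  }
  return result;
}

const buckets = trackSelections([
  { name: "email login", kind: "table-stakes", selected: true },
  { name: "password reset", kind: "table-stakes", selected: false },
  { name: "CSV export", kind: "table-stakes", selected: false, explicitlyDeferred: true },
  { name: "AI summaries", kind: "differentiator", selected: false },
]);
console.log(buckets);
```

Routing the undeclared case to a separate bucket, rather than to v2, is one way to honor the "don't silently drop" rule: it forces a follow-up question instead of a quiet deferral.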
@@ -350,12 +350,24 @@ node ${QUALIA_BIN}/qualia-ui.js banner roadmap
 
  **Roadmapper output branches on `PROJECT_TYPE`:**
 
- - **Demo** (`PROJECT_TYPE=demo`): roadmapper produces a 1-milestone JOURNEY.md (the demo milestone, 2-4 phases) plus a matching REQUIREMENTS.md and a fully-detailed ROADMAP.md. No "Handoff" milestone is appended — the demo is its own complete artifact. The journey-tree at Step 14 shows a single rung; the "extend to full project" branch is handled later by `/qualia-milestone` if the client signs.
- - **Full project** (`PROJECT_TYPE=full`): roadmapper produces the standard 2-5 milestone arc ending in Handoff. Milestone 1 fully detailed, M2..M{N-1} sketched (unless `--full-detail`).
+ - **Demo** (`PROJECT_TYPE=demo`): roadmapper produces a 1-milestone JOURNEY.md (the demo milestone, 2-4 phases) plus a matching REQUIREMENTS.md and a fully-detailed ROADMAP.md. No "Handoff" milestone is appended — the demo is its own complete artifact. The journey-tree at Step 15 shows a single rung; the "extend to full project" branch is handled later by `/qualia-milestone` if the client signs.
+ - **Full project** (`PROJECT_TYPE=full`): roadmapper produces the full milestone arc ending in Handoff (client projects), sized to cover the entire capability inventory from discovery §9 — no fixed milestone count. Milestone 1 fully detailed, M2..M{N-1} sketched (unless `--full-detail`).
 
  Spawn the roadmapper with `<project_type>$PROJECT_TYPE</project_type>` in the prompt. If the user passed `--full-detail`, include `<full_detail>true</full_detail>` so the roadmapper writes complete phase detail for ALL milestones (full project only; demo always has full detail because there's only one milestone). See REFERENCE.md section "Roadmapper prompt" for the verbatim prompt template.
 
- ### Step 14. Present the Journey (single view)
+ ### Step 14. Coverage gate (before presenting — the genesis teeth)
+
+ Before showing the journey, verify the arc actually covers the whole agreed scope. This is what stops the framework from generating "milestones that don't finish the project."
+
+ 1. Read the capability inventory: `.planning/project-discovery.md` §9 (full projects).
+ 2. Read `.planning/REQUIREMENTS.md` traceability.
+ 3. Confirm **every §9 capability has a REQ-ID mapped to a milestone** (Unmapped = 0), and the only items in `Post-Handoff (v2)` / `Out of Scope` are the ones the client explicitly deferred in §8.
+
+ If any §9 capability is unmapped, or agreed work was pushed to v2 to shorten the arc → **do NOT present for approval.** Re-spawn the roadmapper with the gap list and instruction to extend the arc until coverage is complete. Only proceed to the ladder when coverage is 100%.
+
+ (Demo projects skip this — they're a single milestone with no §9 inventory.)
+
+ ### Step 15. Present the Journey (single view)
 
  Render the branded journey ladder:
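The Step 14 coverage gate added in this hunk can be sketched as a pure check. This is a hedged illustration only: the package does not show how `.planning/project-discovery.md` §9 or the `REQUIREMENTS.md` traceability table are parsed, so the input shapes (`capabilities`, `requirements` rows with `reqId`, `capability`, `milestone`, and the `deferred` list from §8) are assumptions, as is the choice to exempt explicitly deferred capabilities from the mapping requirement.

```javascript
// Hypothetical sketch of the Step 14 coverage gate.
// Assumed shapes: capabilities = ids from discovery §9,
// requirements = [{ reqId, capability, milestone }] from REQUIREMENTS.md,
// deferred = capability ids the client explicitly deferred in §8.
const PARKED = new Set(["Post-Handoff (v2)", "Out of Scope"]);

function checkCoverage(capabilities, requirements, deferred = []) {
  const deferredIds = new Set(deferred);
  const unmapped = [];
  const parkedWithoutDeferral = [];

  for (const capability of capabilities) {
    if (deferredIds.has(capability)) continue; // explicit deferral is allowed
    const rows = requirements.filter((r) => r.capability === capability);
    if (rows.length === 0) {
      unmapped.push(capability); // no REQ-ID at all
    } else if (rows.every((r) => PARKED.has(r.milestone))) {
      parkedWithoutDeferral.push(capability); // pushed to v2 without consent
    }
  }

  const ok = unmapped.length === 0 && parkedWithoutDeferral.length === 0;
  return { ok, unmapped, parkedWithoutDeferral };
}

const gate = checkCoverage(
  ["auth", "billing", "reports"],
  [
    { reqId: "REQ-01", capability: "auth", milestone: "M1" },
    { reqId: "REQ-02", capability: "billing", milestone: "Post-Handoff (v2)" },
  ],
  []
);
if (!gate.ok) {
  // In the skill, this is the point where the roadmapper is re-spawned with the gap list.
  console.log("Coverage gaps:", gate.unmapped, gate.parkedWithoutDeferral);
}
```

Returning the two gap lists separately mirrors the two failure modes the step names: a capability with no REQ-ID, and agreed work parked in v2 to shorten the arc.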
 
@@ -367,7 +379,7 @@ This shows M1..M{N} as a vertical ladder: shipped milestones get a green dot, cu
 
  Also narrate the one-glance summary. See REFERENCE.md section "Journey ladder format" for the ASCII template.
 
- ### Step 15. Approval Gate (single — for the whole journey)
+ ### Step 16. Approval Gate (single — for the whole journey)
 
  - header: "Journey"
  - question: "Does this journey work for you?"
@@ -397,7 +409,7 @@ node ${QUALIA_BIN}/qualia-ui.js info "Full phase detail for each later milestone
 
  (Skip this block when `--full-detail` was used — all milestones are already fully planned in that case.)
 
- ### Step 16. Environment Setup
+ ### Step 17. Environment Setup
 
  Supabase project? `supabase link` or create. Vercel project? `vercel link`. Env vars? `.env.local` with placeholders from PROJECT.md stack.
 
@@ -408,7 +420,7 @@ git add .gitignore
  git commit -m "chore: environment setup" 2>/dev/null
  ```
 
- ### Step 17. Auto-Apply Gate (or stop here)
+ ### Step 18. Auto-Apply Gate (or stop here)
 
  If invoked with `--auto`, skip straight into building Milestone 1:
 
@@ -465,10 +477,10 @@ Do NOT use `--quick` for: client projects, anything with compliance stakes, anyt
  1a. **One stack question, then a real scaffold.** Step 5a asks exactly ONE stack-preset question (`landing-page` / `full-app` / `ai-agent` / `internal-tool`) and then instantiates it: copies `scaffold/` into the project (never clobbering existing files), copies `env.required.json` → `.planning/env.required.json`, and copies `phases.md` → `.planning/phases.md`. The project starts from a runnable skeleton, not an empty folder. Skip this step on `--quick`. Record the choice as `config.json.stack_id`.
  2. **AskUserQuestion for every discrete-choice question.** Project type, brownfield gate, design vibe, client type, approval gate, auto-chain — all use the interactive UI. The ONLY free-text question in the kickoff flow is the Step 3 one-line pitch. No plain-text prompts for anything that has a closed set of answers.
  3. **No ad-hoc clarification questioning.** After Step 3 (one-line pitch), the next tool call is `/qualia-scope`. No "let me ask a few quick things first", no "that's too broad, can you clarify". Depth is the scope skill's job — not yours.
- 4. **Discovery interview is mandatory.** Step 4 always invokes `/qualia-scope` in PROJECT MODE. No free-form questioning loop, no "I'll just sketch PROJECT.md from the user's first message." The interview is 8 questions for demo, 14 for full project.
+ 4. **Discovery interview is mandatory.** Step 4 always invokes `/qualia-scope` in PROJECT MODE. No free-form questioning loop, no "I'll just sketch PROJECT.md from the user's first message." The interview is 8 questions for demo, 15 for full project.
  5. **Research runs automatically.** No permission ask. Only `--quick` skips it. Demo path uses `<scope>quick</scope>` (3-call budget per researcher); full project uses standard 8-call budget.
  6. **Demo design philosophy is non-negotiable.** Real backend always (Supabase, real auth), DESIGN.md mandatory, slop-detect hard-block, 1 milestone, focus on real agent/platform functionality + design quality. No mock data, no lorem ipsum, no broken flows. Speed comes from skipping multi-milestone planning, never from skipping design quality, mocking the backend, or cutting corners on the core flow. A demo that uses mock data is not a Qualia demo.
- 7. **Demos are 1 milestone, full projects are 2-5.** Demo journeys have no "Handoff" — the demo IS the artifact. Full projects always end in Handoff (fixed 4 phases). The journey-tree adapts to both shapes.
+ 7. **Demos are 1 milestone; full projects are 2+ (as many as the agreed scope needs — no cap).** Demo journeys have no "Handoff" — the demo IS the artifact. Full client projects end in Handoff (fixed 4 phases); internal/ongoing products may end at their done-state milestone. The journey-tree adapts to any length. A full project's arc must cover every capability from discovery §9 — never trimmed to hit a milestone count.
  8. **The full-project journey includes Handoff.** Every full project's final milestone is literally named "Handoff" with 4 standard phases. The roadmapper enforces this.
  9. **Single approval gate.** One gate for the whole journey. Not per-milestone, not per-phase.
  10. **Milestone 1 is fully detailed (full projects).** M2..M{N-1} are sketched. Detail fills in when each milestone opens. Demos are always fully detailed because they're 1 milestone.
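The journey-shape constraints in revised rule 7 lend themselves to a quick structural check. The following is a sketch under stated assumptions, not the roadmapper's actual enforcement: the milestone shape `{ name, phases }` and the `clientProject` option are hypothetical names introduced here.

```javascript
// Hypothetical validator for the journey shapes described in rule 7.
// `milestones` is an assumed array of { name, phases } objects;
// `clientProject` is an assumed flag separating client work from internal products.
function validateJourneyShape(projectType, milestones, { clientProject = true } = {}) {
  const errors = [];
  const last = milestones[milestones.length - 1];

  if (projectType === "demo") {
    if (milestones.length !== 1) errors.push("demo must have exactly 1 milestone");
    if (milestones.some((m) => m.name === "Handoff")) {
      errors.push("demo must not include a Handoff milestone");
    }
  } else if (projectType === "full") {
    // No upper bound: the arc grows to fit the agreed scope.
    if (milestones.length < 2) errors.push("full project needs at least 2 milestones");
    if (clientProject && (!last || last.name !== "Handoff")) {
      errors.push("client project must end in a Handoff milestone");
    } else if (clientProject && last.phases !== 4) {
      errors.push("Handoff must have exactly 4 phases");
    }
  } else {
    errors.push(`unknown PROJECT_TYPE: ${projectType}`);
  }
  return errors;
}

console.log(validateJourneyShape("demo", [{ name: "Demo", phases: 3 }]));
console.log(validateJourneyShape("full", [{ name: "Core", phases: 5 }]));
```

Note that the check deliberately has no maximum milestone count, matching the "no cap" wording introduced by this version.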
package/skills/qualia-plan/SKILL.md +2 -2
@@ -27,9 +27,9 @@ Spawn planner to break phase into tasks, validate with checker (max 2 revision c
 
  ## Process
 
- ### 0. Codex goal (Codex runtime only)
+ ### 0. Set the work-unit goal
 
- Per `rules/codex-goal.md` — set the thread goal at plan start with scope `phase`. The objective covers both planning and the subsequent build, so a single goal-set at this stage is enough.
+ Per `rules/codex-goal.md` — set the work-unit goal at plan start with scope `phase` (Codex `/goal`; on Claude Code, a tracked task + budget). The objective covers both planning and the subsequent build, so a single goal-set at this stage is enough.
 
  ### 1. Determine Phase & Load Context
 
package/skills/qualia-report/SKILL.md +10 -0
@@ -123,6 +123,16 @@ if [ "$DRY_RUN" != "true" ] && [ -d .git ]; then
  fi
  ```
 
+ ### Step 5b — Branch hygiene sweep (clock-out safety net)
+
+ Ship integrates every deploy into `main`, but work that was built and never shipped strands on a branch, and review PRs can linger. Surface both before the employee leaves — informational, never blocks:
+
+ ```bash
+ node ${QUALIA_BIN}/branch-hygiene.js
+ ```
+
+ Exit 1 → it lists branches with unshipped commits ahead of `main` (run `/qualia-ship` or merge them, or delete if abandoned) and any stale open PRs. Exit 0 → nothing stranded. Include the summary in the closing message so the OWNER sees it in the report.
+
  ### Step 6 — Upload to ERP
 
  The full payload-builder + 3-attempt-retry logic lives unchanged from v4 — see the **ERP Upload** section below for the canonical implementation. Behavior summary:
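The exit-code contract of the new Step 5b sweep (exit 1 lists stranded work, exit 0 means nothing stranded, and the step never blocks) can be illustrated with a small interpreter function. This is a sketch, not the package's implementation: `summarizeHygiene` and its return shape are hypothetical, and the handling of exit codes other than 0 and 1 is an assumption, since the diff does not say what they mean.

```javascript
// Hypothetical interpretation of the branch-hygiene.js exit code.
// The function name, return shape, and the "other exit code" branch are assumptions.
function summarizeHygiene(exitCode, output = "") {
  if (exitCode === 0) {
    return { stranded: false, blocks: false, summary: "Branch hygiene: nothing stranded." };
  }
  if (exitCode === 1) {
    return {
      stranded: true,
      blocks: false, // informational only: the report still goes out
      summary: `Branch hygiene: stranded branches or stale PRs found.\n${output.trim()}`,
    };
  }
  // Anything else (crash, missing script) is surfaced but still never blocks.
  return {
    stranded: false,
    blocks: false,
    summary: `Branch hygiene: sweep did not complete (exit ${exitCode}).`,
  };
}

console.log(summarizeHygiene(1, "feature/login: 2 commits ahead of main\n").summary);
```

A caller could obtain `exitCode` and `output` from the `status` and `stdout` fields returned by Node's `child_process.spawnSync` when running the script shown in the hunk, then append `summary` to the closing message.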
package/skills/qualia-scope/SKILL.md +3 -3
@@ -45,7 +45,7 @@ The non-technical conversation at the start of `/qualia-new`, BEFORE roadmapping
 
  ### P1. Project type (or accept it from `/qualia-new`)
 
- If `/qualia-new` already asked the Demo vs Full gate (its literal first question), it passes `PROJECT_TYPE=demo` | `PROJECT_TYPE=full` via env/arg — **skip the gate, do not re-ask.** Only ask it when invoked standalone, via **AskUserQuestion** (header "Project shape"): "Demo (single shippable milestone, sales conversation)" vs "Full project (multi-milestone arc to Handoff)". Demo runs §1–§8 of the discovery template; Full runs all 14.
+ If `/qualia-new` already asked the Demo vs Full gate (its literal first question), it passes `PROJECT_TYPE=demo` | `PROJECT_TYPE=full` via env/arg — **skip the gate, do not re-ask.** Only ask it when invoked standalone, via **AskUserQuestion** (header "Project shape"): "Demo (single shippable milestone, sales conversation)" vs "Full project (multi-milestone arc to Handoff)". Demo runs §1–§8 of the discovery template; Full runs §1–§15 (adds the capability-completeness pass + delivery questions).
 
  ### P2. Open
 
@@ -53,11 +53,11 @@ If `/qualia-new` already asked the Demo vs Full gate (its literal first question
  node ${QUALIA_BIN}/qualia-ui.js banner scope 2>/dev/null || true
  ```
 
- Say **"Eight quick questions for the demo path"** or **"Fourteen questions to shape the full project — we'll move fast."**
+ Say **"Eight quick questions for the demo path"** or **"Fifteen questions to shape the full project — we'll move fast. The middle ones map out everything the project needs to be DONE."**
 
  ### P3. One question at a time, from `templates/project-discovery.md`
 
- For each §1..§8 (demo) or §1..§14 (full), ask in plain language:
+ For each §1..§8 (demo) or §1..§15 (full), ask in plain language:
 
  ```
  **Question {N}/{total}:** {question text from the template}