qualia-framework 5.8.0 → 5.9.1
- package/agents/plan-checker.md +8 -0
- package/agents/qa-browser.md +7 -0
- package/agents/roadmapper.md +8 -0
- package/agents/verifier.md +14 -1
- package/bin/cli.js +30 -1
- package/bin/erp-retry.js +289 -0
- package/bin/install.js +6 -0
- package/bin/state.js +10 -1
- package/docs/onboarding.html +3 -5
- package/docs/playwright-loop-pilot-results.md +7 -5
- package/docs/research/2026-05-11-deep-research.md +189 -0
- package/hooks/session-start.js +18 -0
- package/package.json +3 -2
- package/rules/speed.md +1 -2
- package/skills/qualia-discuss/SKILL.md +4 -2
- package/skills/qualia-new/SKILL.md +71 -43
- package/skills/qualia-report/SKILL.md +64 -2
- package/skills/qualia-verify/SKILL.md +16 -0
- package/templates/help.html +2 -3
- package/tests/bin.test.sh +23 -5
- package/tests/refs.test.sh +146 -0
|
@@ -0,0 +1,189 @@
|
|
|
1
|
+
# Qualia Framework — Deep Research Audit
|
|
2
|
+
|
|
3
|
+
**Date:** 2026-05-11
|
|
4
|
+
**Version audited:** v5.8.0 (tip `387c422`)
|
|
5
|
+
**Auditors:** 4 parallel investigators (surface health, ERP integration, token economy + personalization, workflow outcomes)
|
|
6
|
+
**Method:** Grounded protocol — every claim carries `file:line` citation with quoted snippet.
|
|
7
|
+
|
|
8
|
+
---
|
|
9
|
+
|
|
10
|
+
## Headline verdict
|
|
11
|
+
|
|
12
|
+
The framework is **structurally solid** but carries three real failure modes the user was right to suspect:
|
|
13
|
+
|
|
14
|
+
1. **Documentation drift is the loudest silent failure.** Three user-facing surfaces (`rules/speed.md`, `templates/help.html`, `docs/onboarding.html`) still list `/qualia-quick`, `/qualia-task`, `/qualia-design`, `/qualia-prd`, `/qualia-polish-loop` — commands removed in v5.7/v5.8. A new hire following onboarding.html immediately hits dead ends.
|
|
15
|
+
2. **ERP health is better than the user feared, but one promise is a lie.** After 3 failed upload attempts the message says "will appear in ERP after retry" — there is no retry mechanism. No queue, no cron, no session-start re-try. Data sits locally until the employee manually re-runs `/qualia-report`. The retry logic that DOES exist (1s/3s/9s backoff, 401/422 permanent-fail distinction) is correct.
|
|
16
|
+
3. **Always-loaded substrate is ~2× larger than the "Pocock discipline" claim implies.** CLAUDE.md is genuinely 24 lines, but the 8 rules files (~480 lines) + 33 skill descriptions (~14.7 KB) total **~10,300 tokens** on every session start. ~5,400 of those are recoverable without losing functionality.
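The retry behavior credited in item 2 above (1s/3s/9s backoff, with 401 and 422 treated as permanent failures) can be sketched as two small pure functions. This is an illustrative sketch, not the framework's code: the names `backoffDelayMs` and `classifyStatus` are hypothetical, and mapping the three delays onto "the wait after the Nth failed attempt" is an assumption the audit text does not confirm.

```javascript
// Hypothetical helpers. Only the 1s/3s/9s schedule and the 401/422
// permanent-failure rule come from the audit text.
const BASE_DELAY_MS = 1000;
const MAX_ATTEMPTS = 3;

// Delay to wait after the Nth failed attempt (1-based): 1000, 3000, 9000 ms.
// Returns null when N is outside the schedule.
function backoffDelayMs(failedAttempts) {
  if (failedAttempts < 1 || failedAttempts > MAX_ATTEMPTS) return null;
  return BASE_DELAY_MS * 3 ** (failedAttempts - 1);
}

// 2xx succeeds; 401 and 422 will not improve on retry; anything else may.
function classifyStatus(status) {
  if (status >= 200 && status < 300) return "ok";
  if (status === 401 || status === 422) return "permanent";
  return "retry";
}
```

Keeping the schedule and the status classification separate from the network call makes both testable without an ERP endpoint.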
|
|
17
|
+
|
|
18
|
+
Production-readiness score (framework as a product): **77 / 100**
|
|
19
|
+
|
|
20
|
+
| Dimension | Score | One-line |
|
|
21
|
+
|---|---:|---|
|
|
22
|
+
| Surface honesty | 6/10 | Dead refs in 3 user-facing files |
|
|
23
|
+
| ERP health | 7/10 | Real retry, false retry-promise, missing idempotency |
|
|
24
|
+
| Token discipline | 6/10 | Real where claimed, but ~5.4K tokens of recoverable bloat |
|
|
25
|
+
| Personalization | 3/10 | 4 employees are identical clones in the framework's eyes |
|
|
26
|
+
| Workflow speed | 7/10 | Road works; kickoff has redundant questions, no fast path |
|
|
27
|
+
| Verifier strictness | 7/10 | Strong protocol, INSUFFICIENT EVIDENCE silently treated as PASS |
|
|
28
|
+
| Test coverage | 8/10 | State machine excellently tested, workflow loop untested |
|
|
29
|
+
| Hooks (safety) | 9/10 | Genuinely well-engineered, zero token tax, real enforcement |
|
|
30
|
+
|
|
31
|
+
---
|
|
32
|
+
|
|
33
|
+
## CRITICAL findings (zero this round)
|
|
34
|
+
|
|
35
|
+
None of the four audits surfaced a CRITICAL severity issue (security breach / data loss / auth bypass / crash on happy path). The framework's safety hooks (`branch-guard`, `git-guardrails`, `migration-guard`, `pre-deploy-gate`) and state-machine locking are real defenses.
|
|
36
|
+
|
|
37
|
+
This is meaningful: the user's "I think this is very fucked up" suspicion about the ERP was wrong at the data-safety level — local commit always happens before upload, retry logic is correct, API key file is mode `0600`, no shell injection.
|
|
38
|
+
|
|
39
|
+
---
|
|
40
|
+
|
|
41
|
+
## HIGH findings (10 — fix before next minor)
|
|
42
|
+
|
|
43
|
+
### Surface honesty
|
|
44
|
+
|
|
45
|
+
**H1. `rules/speed.md:50-51` lists removed `/qualia-quick` + `/qualia-task` as active.**
|
|
46
|
+
`speed.md` is read by users looking for shortcuts. Users will invoke commands that don't exist.
|
|
47
|
+
|
|
48
|
+
**H2. `templates/help.html:377-379` shows `/qualia-quick`, `/qualia-task`, `/qualia-design`.**
|
|
49
|
+
This is what `/qualia-help` opens in the browser. It's the canonical reference page.
|
|
50
|
+
|
|
51
|
+
**H3. `docs/onboarding.html:509, 524, 525, 556` shows 4 removed commands.**
|
|
52
|
+
This is the file the README explicitly recommends sending to new hires.
|
|
53
|
+
|
|
54
|
+
### ERP
|
|
55
|
+
|
|
56
|
+
**H4. `skills/qualia-report/SKILL.md:238` — "will appear in ERP after retry" is a lie.**
|
|
57
|
+
There is no retry mechanism. No background queue, no cron, no session-start drain. Either build a retry queue at `~/.claude/.erp-retry-queue.json` and drain it on session-start, OR change the message to an honest "Re-run /qualia-report to retry."
|
|
58
|
+
|
|
59
|
+
**H5. `skills/qualia-report/SKILL.md:220-222` — no `Idempotency-Key` header sent.**
|
|
60
|
+
The ERP contract documents idempotency support with a 24h replay window (`docs/erp-contract.md:42-49`). The framework ignores this. The UPSERT on `(project_id, client_report_id)` covers most cases, but a retry after a response is lost mid-flight could double-count without the explicit header.
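One way H5 could be addressed is to derive the header value from the same pair the UPSERT already keys on, so that every retry of one report carries the same `Idempotency-Key`. The sketch below rests on several assumptions: the SHA-256 derivation, the `Bearer` authorization scheme, and the function names are illustrative and are not taken from `docs/erp-contract.md`, which may prescribe a different key format.

```javascript
const crypto = require("crypto");

// Hypothetical derivation: a stable key from the fields the ERP upserts on.
// The same report always yields the same key, so a retry is recognizable.
function idempotencyKey(projectId, clientReportId) {
  return crypto
    .createHash("sha256")
    .update(`${projectId}:${clientReportId}`)
    .digest("hex");
}

// Builds headers for a report upload. The Authorization scheme is assumed.
function buildHeaders(apiKey, payload) {
  const body = JSON.stringify(payload);
  return {
    "Content-Type": "application/json",
    "Content-Length": Buffer.byteLength(body),
    Authorization: `Bearer ${apiKey}`,
    "Idempotency-Key": idempotencyKey(payload.project_id, payload.client_report_id),
  };
}
```

The resulting object can be passed as the `headers` option of Node's `https.request`, which the audit notes the framework already uses for uploads.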
|
|
61
|
+
|
|
62
|
+
**H6. `session_duration_minutes` documented in ERP contract example but never sent.**
|
|
63
|
+
`docs/erp-contract.md:93` shows it; the payload builder at `skills/qualia-report/SKILL.md:192-205` never computes it. Trivial fix: `Math.round((Date.now() - new Date(t.session_started_at)) / 60000)`.
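The one-line fix quoted in H6 can be wrapped in a small function that also handles a missing or malformed start time. This is a sketch: the function name, the injectable `now` parameter, and the choice to return `null` for unusable input are assumptions, not behavior documented in the ERP contract.

```javascript
// Whole minutes between session start and now, or null if the start
// timestamp is missing, unparseable, or in the future.
function sessionDurationMinutes(sessionStartedAt, now = Date.now()) {
  if (sessionStartedAt == null) return null; // new Date(null) would be the epoch
  const started = new Date(sessionStartedAt).getTime();
  if (Number.isNaN(started) || started > now) return null;
  return Math.round((now - started) / 60000);
}
```

Passing `now` explicitly keeps the calculation deterministic in tests; the payload builder would call it with the default.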
|
|
64
|
+
|
|
65
|
+
### Workflow quality
|
|
66
|
+
|
|
67
|
+
**H7. `agents/verifier.md:47` — 25-call tool budget + `skills/qualia-verify/SKILL.md` doesn't block on INSUFFICIENT EVIDENCE.**
|
|
68
|
+
The verifier is told to "mark unchecked criteria as INSUFFICIENT EVIDENCE" when its budget is exhausted. The orchestrator does not grep for that string before declaring PASS, so phases with 8+ tasks can pass verification with criteria that were never checked. **This is the #1 false-pass vector.**
|
|
69
|
+
|
|
70
|
+
**H8. `/qualia-new` has 15-21+ user questions before any code is written.**
|
|
71
|
+
14 discovery + 1 design vibe + 1 client + 5 PRODUCT.md + N feature scoping. The PRODUCT.md questions at `skills/qualia-new/SKILL.md:163-169` overlap with discovery questions 2-5. Demos hit ~15 interactions despite the "8 questions for demos" framing.
|
|
72
|
+
|
|
73
|
+
### Personalization
|
|
74
|
+
|
|
75
|
+
**H9. All 4 employees (Hasan, Moayad, Rama, Sally) have identical role descriptions.**
|
|
76
|
+
`bin/install.js:31-57` — every EMPLOYEE entry has the description "Developer. Feature branches only. Cannot push to main." No stack expertise, seniority, specialization. The framework cannot adapt explanation depth, task assignment, or review style per developer.
|
|
77
|
+
|
|
78
|
+
**H10. `architecture.md` is always-loaded but explicitly says "Do not auto-load this on quick fixes."**
|
|
79
|
+
`rules/architecture.md` is 125 lines (~1,560 tokens). It lives in `~/.claude/rules/` where Claude Code auto-loads everything. The file itself contradicts its location.
|
|
80
|
+
|
|
81
|
+
---
|
|
82
|
+
|
|
83
|
+
## MEDIUM findings (12 — fix this quarter)
|
|
84
|
+
|
|
85
|
+
| # | Where | What |
|
|
86
|
+
|---|---|---|
|
|
87
|
+
| M1 | `bin/state.js:332` | Progress bar formula `(phase-1)/total_phases` — completed project shows 66%, never 100% |
|
|
88
|
+
| M2 | `agents/builder.md:155`, `agents/planner.md:147`, `agents/research-synthesizer.md:91` | "likely", "probably" — hedging language the grounding protocol explicitly bans |
|
|
89
|
+
| M3 | `bin/state.js:375-376` | `polished → shipped` mandatory; no skip for API-only / backend-only projects |
|
|
90
|
+
| M4 | `skills/zoho-workflow/` | Completely unreferenced — orphan skill |
|
|
91
|
+
| M5 | `tests/skills.test.sh` | Tests structure, not behavior — no integration test for the plan→build→verify loop |
|
|
92
|
+
| M6 | `agents/plan-checker.md:83` | "Any shared file = wave conflict" forces unnecessary serialization (no read-only-overlap distinction) |
|
|
93
|
+
| M7 | `agents/plan-checker.md:173-179` | Scope-reduction Rule 10 substring-matches "v1", "basic version" — false REVISE on legit plans |
|
|
94
|
+
| M8 | `agents/verifier.md:315-317` + `:356-362` | Design rubric fires full 8-dim on any `.tsx` file presence — backend-heavy phases get disproportionate design scrutiny |
|
|
95
|
+
| M9 | `bin/cli.js:874-888` | `erp-ping` sends synthetic payload — does not validate current schema |
|
|
96
|
+
| M10 | `skills/qualia-report/SKILL.md:197` | `framework_version` reads from config snapshot at install time — stale after `npm update` |
|
|
97
|
+
| M11 | `skills/qualia-new/SKILL.md:163-169` | PRODUCT.md questions duplicate discovery questions 2-5 |
|
|
98
|
+
| M12 | `skills/qualia/SKILL.md:38-55` | Router has no row for "I want a one-off change outside the Road" — users must already know `/qualia-feature` exists |
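To make M1 in the table above concrete: with the formula `(phase-1)/total_phases`, a project sitting on phase 3 of 3 reports 2/3, which is the 66% the audit mentions. The sketch below reproduces that arithmetic and shows one possible correction. The function names, the `projectDone` flag, and the use of `Math.floor` (inferred from the "66%" figure) are assumptions; `bin/state.js` may track completion differently.

```javascript
// Reproduces the formula cited in M1: the current phase never counts as done.
function progressCurrent(phase, totalPhases) {
  return Math.floor(((phase - 1) / totalPhases) * 100);
}

// Hypothetical fix: once the project is marked done, count every phase.
function progressFixed(phase, totalPhases, projectDone) {
  const completed = projectDone ? totalPhases : phase - 1;
  return Math.floor((completed / totalPhases) * 100);
}
```

For a three-phase project, `progressCurrent(3, 3)` yields 66 while `progressFixed(3, 3, true)` yields 100.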
|
|
99
|
+
|
|
100
|
+
---
|
|
101
|
+
|
|
102
|
+
## LOW findings (5)
|
|
103
|
+
|
|
104
|
+
- L1. `hooks/pre-compact.js:80-81` — default `--no-verify` + `--no-gpg-sign` (configurable, documented, but silent default)
|
|
105
|
+
- L2. `docs/playwright-loop-pilot-results.md:7,16,56,113` — references the renamed `skills/qualia-polish-loop/` path
|
|
106
|
+
- L3. `skills/qualia-report/SKILL.md:152` — error table shows the old `set-erp-key <key>` positional syntax (CLI now requires piped)
|
|
107
|
+
- L4. `bin/state.js:130` — trace probabilistic pruning (1% chance) may let `.qualia-traces/` grow unnecessarily on heavy installs
|
|
108
|
+
- L5. `agents/visual-evaluator.md:91` — `likely_file` (as JSON field name, borderline)
|
|
109
|
+
|
|
110
|
+
---
|
|
111
|
+
|
|
112
|
+
## Cross-cutting patterns
|
|
113
|
+
|
|
114
|
+
### Pattern 1: Surface drift outpaces test coverage
|
|
115
|
+
|
|
116
|
+
`tests/skills.test.sh` validates that every SKILL.md has the right frontmatter. It does NOT validate that command references inside SKILL.md, rules/, docs/, templates/ point to skills that still exist. Three v5.7/v5.8 removals slipped through because the test surface is structural, not referential.
|
|
117
|
+
|
|
118
|
+
**Fix:** Add `tests/refs.test.sh` that greps every `.md` and `.html` for `/qualia-{name}` and asserts each name has a matching `skills/qualia-{name}/SKILL.md`.
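The referential check proposed here can be sketched as a pure function over file contents. The shipped `tests/refs.test.sh` is a shell script and is not reproduced in this section, so the following is only an illustration of the idea in JavaScript: the regular expression, the function name, and passing the set of existing skills as an argument (instead of checking for `skills/qualia-{name}/SKILL.md` on disk) are all assumptions.

```javascript
// Matches slash-command references such as `/qualia-quick`, but not path
// segments such as skills/qualia-report/SKILL.md.
const COMMAND_REF = /(?<![\w.\/-])\/(qualia-[a-z0-9-]*[a-z0-9])(?![\w\/-])/g;

// Returns the sorted, de-duplicated command names that appear in `text`
// and have no entry in `existingSkills` (a Set of skill directory names).
function findDeadCommandRefs(text, existingSkills) {
  const dead = new Set();
  for (const match of text.matchAll(COMMAND_REF)) {
    if (!existingSkills.has(match[1])) dead.add(match[1]);
  }
  return [...dead].sort();
}
```

A test runner would read each `.md` and `.html` file, call this per file, and fail if any returned list is non-empty.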
|
|
119
|
+
|
|
120
|
+
### Pattern 2: The framework is more disciplined than its tooling enforces
|
|
121
|
+
|
|
122
|
+
CLAUDE.md is genuinely lean. Design substrate was correctly moved off the always-loaded path. Hooks enforce deterministically. But:
|
|
123
|
+
|
|
124
|
+
- `rules/architecture.md` lives where it auto-loads despite its own warning
|
|
125
|
+
- Skill descriptions accumulate flavor text ("Karpathy-style", "v5.3 from Matt Pocock's...") on top of trigger phrases — every change invalidates the cache prefix
|
|
126
|
+
- 8 rules files always-load when 3 are sufficient for most sessions
|
|
127
|
+
|
|
128
|
+
**Fix:** A `tests/budget.test.sh` that asserts total always-loaded substrate stays under ~6,000 tokens.
|
|
129
|
+
|
|
130
|
+
### Pattern 3: The verifier knows what to check but can't always afford to check it
|
|
131
|
+
|
|
132
|
+
The 3-level verification (Truths / Artifacts / Wiring) is the right abstraction. The 25-call budget makes it unaffordable on phases with 8+ tasks. INSUFFICIENT EVIDENCE is the escape hatch, but the orchestrator doesn't penalize it, so the verifier silently approves under-verified phases.
|
|
133
|
+
|
|
134
|
+
**Fix:** Budget = `max(25, tasks * 5)`. AND: any INSUFFICIENT EVIDENCE in the verification file → verdict downgraded to FAIL.
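Both halves of this fix are small enough to sketch directly. The function names and the idea of passing the verification report as a string are assumptions for illustration; only the `max(25, tasks * 5)` formula and the downgrade-to-FAIL rule come from the text above.

```javascript
// Tool-call budget that grows with phase size but never drops below 25.
function verifierBudget(taskCount) {
  return Math.max(25, taskCount * 5);
}

// Any INSUFFICIENT EVIDENCE marker in the report overrides the verdict.
function finalVerdict(reportText, verifierVerdict) {
  return reportText.includes("INSUFFICIENT EVIDENCE") ? "FAIL" : verifierVerdict;
}
```

With this formula the budget stays at 25 for phases of up to 5 tasks and reaches 40 for an 8-task phase.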
|
|
135
|
+
|
|
136
|
+
### Pattern 4: Personalization is structurally absent
|
|
137
|
+
|
|
138
|
+
There are 4 distinct humans (Hasan, Moayad, Rama, Sally) who get an identical one-sentence description. The daily-log, knowledge layer, learned-patterns, and commit history all contain per-user signal that is collected and never read for personalization.
|
|
139
|
+
|
|
140
|
+
**Fix:** Per-employee profile file under `~/.claude/team/{code}.md`, injected into CLAUDE.md template at install time. Auto-derived from daily logs via a `/qualia-flush` extension.
|
|
141
|
+
|
|
142
|
+
---
|
|
143
|
+
|
|
144
|
+
## Top 10 fixes ranked by ROI
|
|
145
|
+
|
|
146
|
+
1. **Find-and-replace the 6 dead command references in `speed.md`, `help.html`, `onboarding.html`** — 15 min. Eliminates every user-facing dead end.
|
|
147
|
+
2. **Add INSUFFICIENT EVIDENCE blocker to `qualia-verify`** — 20 min. Eliminates the #1 false-pass vector.
|
|
148
|
+
3. **Stop lying about retry: either build the queue or change the message** — 30 min for the message change, ~3 hours for the queue.
|
|
149
|
+
4. **Send `Idempotency-Key` + `session_duration_minutes` in ERP payload** — 20 min. Completes the documented contract.
|
|
150
|
+
5. **Move `architecture.md` to `~/.claude/qualia-substrate/` (lazy-load)** — 10 min. Saves ~1,560 tokens per session.
|
|
151
|
+
6. **Trim skill descriptions to trigger-phrases-only** — 1 hour. Saves ~1,500 tokens per session + improves cache stability.
|
|
152
|
+
7. **Per-employee profile files (`team/{code}.md`)** — 2 hours. Transforms personalization from 0 to material.
|
|
153
|
+
8. **Scale verifier budget to `max(25, tasks*5)`** — 5 min. Eliminates INSUFFICIENT EVIDENCE on large phases.
|
|
154
|
+
9. **Merge PRODUCT.md questions into the discovery interview** — 30 min. Cuts ~3 questions from every kickoff.
|
|
155
|
+
10. **Add `tests/refs.test.sh`** — 45 min. Prevents the next surface-drift incident.
|
|
156
|
+
|
|
157
|
+
**Total effort for all 10:** ~8-10 hours.
|
|
158
|
+
**Effort-weighted impact:** removes every false-pass vector found, every documented dead link, ~3K recoverable tokens, the framework's biggest UX papercut (personalization), and the largest invisible quality risk (INSUFFICIENT EVIDENCE silently passing).
|
|
159
|
+
|
|
160
|
+
---
|
|
161
|
+
|
|
162
|
+
## What the framework does WELL (honest acknowledgement)
|
|
163
|
+
|
|
164
|
+
These are not patronizing — they came up across all four audits.
|
|
165
|
+
|
|
166
|
+
1. **State machine** (`bin/state.js`). 57 behavioral tests. Atomic dual-file writes. File-based locking with stale detection. Crash-recovery journaling. Gap-cycle circuit breaker with configurable limit. Schema validation + repair. This is real engineering.
|
|
167
|
+
2. **Hook architecture.** Pure Node.js (Windows-safe). Zero model-token tax (deterministic enforcement, not instructional). Real protections: service_role leak scan, force-push-to-main block (role-aware), migration safety, Vercel account guard, env-empty guard.
|
|
168
|
+
3. **Verifier abstraction** (Truths / Artifacts / Wiring). Most AI coding tools check "did the task run." The Qualia verifier checks "is the artifact substantive, is it imported, is it called." The stub-detection patterns are operational learning, not theory.
|
|
169
|
+
4. **Polish-loop kill-switch.** Fingerprint regression detection + budget cap. Real engineering against the known infinite-loop failure mode of vision-model feedback loops.
|
|
170
|
+
5. **ERP security hardening.** Native `https.request` instead of curl (no bearer in `/proc/cmdline`). API key mode `0600`. Refuses positional CLI args. Env-var passing in payload builder (no shell injection). Atomic tmp+rename writes.
|
|
171
|
+
|
|
172
|
+
---
|
|
173
|
+
|
|
174
|
+
## Resources & references
|
|
175
|
+
|
|
176
|
+
- All findings cite specific files. Original investigator outputs from this audit are not committed (held in conversation context).
|
|
177
|
+
- Grounding protocol: `/home/qualia-new/.claude/rules/grounding.md`
|
|
178
|
+
- Severity criteria: same file.
|
|
179
|
+
- Pocock instruction-budget pattern: referenced in README:8.
|
|
180
|
+
- ERP contract spec: `docs/erp-contract.md` (this file exists and was used by the ERP-integration audit).
|
|
181
|
+
|
|
182
|
+
---
|
|
183
|
+
|
|
184
|
+
## Open questions for the user
|
|
185
|
+
|
|
186
|
+
1. **The retry-queue approach for ERP** — would you prefer (a) an honest message change ("re-run to retry"), (b) a queue file drained on session-start, or (c) a cron job? Each has different operational characteristics.
|
|
187
|
+
2. **Personalization depth** — willing to spend an afternoon writing 4 profile files for Hasan/Moayad/Rama/Sally? Or want this auto-derived from daily logs over time?
|
|
188
|
+
3. **Token cuts vs cache stability** — some of the cuts (e.g. trimming skill descriptions) will invalidate prompt caches for one cycle. Worth it once, or keep stable?
|
|
189
|
+
4. **The 14-question discovery interview** — keep depth at the kickoff cost, or shave 4-5 questions and accept slightly fuzzier project framing?
|
package/hooks/session-start.js
CHANGED
|
@@ -26,6 +26,8 @@ const STATE_FILE = path.join(".planning", "STATE.md");
|
|
|
26
26
|
const CONTINUE_HERE = ".continue-here.md";
|
|
27
27
|
const NOTIF_FILE = path.join(HOME, ".claude", ".qualia-update-available.json");
|
|
28
28
|
const HEALTH_FILE = path.join(HOME, ".claude", ".qualia-install-health.json");
|
|
29
|
+
const ERP_RETRY = path.join(HOME, ".claude", "bin", "erp-retry.js");
|
|
30
|
+
const ERP_QUEUE = path.join(HOME, ".claude", ".erp-retry-queue.json");
|
|
29
31
|
|
|
30
32
|
// Critical files referenced by skills via @-import. If any are missing, skills
|
|
31
33
|
// silently get empty context and produce ungrounded output. We spot-check these
|
|
@@ -115,6 +117,21 @@ function fallbackText() {
|
|
|
115
117
|
}
|
|
116
118
|
}
|
|
117
119
|
|
|
120
|
+
function maybeDrainErpQueue() {
|
|
121
|
+
// Fire-and-forget drain of any reports stranded from a prior /qualia-report
|
|
122
|
+
// upload failure. Cheap fast-path: only spawn if the queue file exists and
|
|
123
|
+
// erp-retry.js is installed. Quiet mode + small max so we never block the
|
|
124
|
+
// session-start critical path. The script itself exits 0 even on internal
|
|
125
|
+
// errors — see erp-retry.js's CLI tail.
|
|
126
|
+
try {
|
|
127
|
+
if (!fs.existsSync(ERP_QUEUE) || !fs.existsSync(ERP_RETRY)) return;
|
|
128
|
+
spawnSync(process.execPath, [ERP_RETRY, "drain", "--quiet", "--max=5", "--timeout=2500"], {
|
|
129
|
+
stdio: "ignore",
|
|
130
|
+
timeout: 8000,
|
|
131
|
+
});
|
|
132
|
+
} catch {}
|
|
133
|
+
}
|
|
134
|
+
|
|
118
135
|
function maybeRenderUpdateBanner() {
|
|
119
136
|
// EMPLOYEE-only sticky banner. auto-update.js writes NOTIF_FILE when a new
|
|
120
137
|
// version is detected; we render it every session until the user actually
|
|
@@ -144,6 +161,7 @@ function renderHealthWarning(missing) {
|
|
|
144
161
|
|
|
145
162
|
try {
|
|
146
163
|
maybeRenderUpdateBanner();
|
|
164
|
+
maybeDrainErpQueue();
|
|
147
165
|
|
|
148
166
|
const healthMissing = checkInstallHealth();
|
|
149
167
|
if (healthMissing) renderHealthWarning(healthMissing);
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "qualia-framework",
|
|
3
|
-
"version": "5.
|
|
3
|
+
"version": "5.9.1",
|
|
4
4
|
"description": "Claude Code workflow framework by Qualia Solutions. Plan, build, verify, ship.",
|
|
5
5
|
"bin": {
|
|
6
6
|
"qualia-framework": "./bin/cli.js"
|
|
@@ -31,7 +31,8 @@
|
|
|
31
31
|
"test:skills": "bash tests/skills.test.sh",
|
|
32
32
|
"test:slop-detect": "bash tests/slop-detect.test.sh",
|
|
33
33
|
"test:statusline": "bash tests/statusline.test.sh",
|
|
34
|
-
"test:
|
|
34
|
+
"test:refs": "bash tests/refs.test.sh",
|
|
35
|
+
"test:shell": "bash tests/statusline.test.sh && bash tests/state.test.sh && bash tests/hooks.test.sh && bash tests/bin.test.sh && bash tests/lib.test.sh && bash tests/skills.test.sh && bash tests/refs.test.sh && bash tests/slop-detect.test.sh"
|
|
35
36
|
},
|
|
36
37
|
"files": [
|
|
37
38
|
"bin/",
|
package/rules/speed.md
CHANGED
|
@@ -47,8 +47,7 @@ The pattern: **on-demand by default; always-on only when the data is irreducibly
|
|
|
47
47
|
|
|
48
48
|
When a Qualia command exists for the situation, use it — don't reinvent:
|
|
49
49
|
- `/qualia` — what's my next step?
|
|
50
|
-
- `/qualia-
|
|
51
|
-
- `/qualia-task` — single focused task, fresh builder spawn, atomic commit
|
|
50
|
+
- `/qualia-feature` — single feature, auto-scoped: inline for trivia, fresh builder spawn for 1-5 file features
|
|
52
51
|
- `/qualia-ship` — full deploy pipeline (quality gates → commit → deploy → verify)
|
|
53
52
|
- `/qualia-review` — production audit
|
|
54
53
|
- `/qualia-pause` — save context before clearing the conversation
|
|
@@ -51,10 +51,12 @@ Hard rule: **never go technical here.** No "Should we use Supabase or Postgres?"
|
|
|
51
51
|
|
|
52
52
|
### P1. Detect project type (or accept it from `/qualia-new`)
|
|
53
53
|
|
|
54
|
-
If `/qualia-new` already asked the Demo vs Full gate, it passes the type in as `PROJECT_TYPE=demo` or `PROJECT_TYPE=full` via env or arg
|
|
54
|
+
If `/qualia-new` already asked the Demo vs Full gate (it does this as Step 1, the literal first question), it passes the type in as `PROJECT_TYPE=demo` or `PROJECT_TYPE=full` via env or arg — **skip the gate, do not re-ask.**
|
|
55
|
+
|
|
56
|
+
Only ask the gate yourself when invoked standalone (not via `/qualia-new`). When you do, use **AskUserQuestion** (interactive UI — never plain text):
|
|
55
57
|
|
|
56
58
|
- header: "Project shape"
|
|
57
|
-
- question: "Is this a demo (single shippable milestone, sales conversation
|
|
59
|
+
- question: "Is this a demo (single shippable milestone, sales conversation) or a full project (multi-milestone arc to Handoff)?"
|
|
58
60
|
- options: ["Demo", "Full project"]
|
|
59
61
|
|
|
60
62
|
This is the only fork. Demo runs §1-§8 of the discovery template. Full project runs all 14 questions.
|
|
@@ -46,11 +46,33 @@ Initialize a project with the **entire arc mapped from kickoff to handoff**. All
|
|
|
46
46
|
node ~/.claude/bin/qualia-ui.js banner new
|
|
47
47
|
```
|
|
48
48
|
|
|
49
|
-
|
|
49
|
+
Banner only. Do NOT ask anything yet. The next thing the user sees is the project-shape gate — not a free-text "tell me what to build". Shape gate first, content second, because the shape drives the question set.
|
|
50
50
|
|
|
51
|
-
|
|
51
|
+
### Step 1. Project Type Gate (Demo vs Full vs Quick) — FIRST QUESTION
|
|
52
52
|
|
|
53
|
-
|
|
53
|
+
The single most important fork. Demo and Full produce different journeys, different research depth, different milestone counts. This is the literal first interaction with the user.
|
|
54
|
+
|
|
55
|
+
Use **AskUserQuestion** (interactive UI — never a plain-text prompt):
|
|
56
|
+
|
|
57
|
+
- header: "Project shape"
|
|
58
|
+
- question: "What kind of project is this? Pick one — it drives everything else."
|
|
59
|
+
- options:
|
|
60
|
+
- "Demo" — one shippable milestone, real backend, no mocks. Built to win a client conversation, extensible via `/qualia-milestone` if they sign. 8-question discovery.
|
|
61
|
+
- "Full project" — the multi-milestone arc to Handoff. 2-5 milestones planned upfront. 14-question discovery.
|
|
62
|
+
- "Quick prototype" — landing page, throwaway, ≤1 day. Skips research and journey. (Equivalent to `--quick` flag.)
|
|
63
|
+
|
|
64
|
+
Store the answer as `PROJECT_TYPE=demo` | `PROJECT_TYPE=full` | `PROJECT_TYPE=quick`. It drives every downstream step.
|
|
65
|
+
|
|
66
|
+
**Demo design philosophy is non-negotiable (do NOT compromise on these regardless of speed pressure):**
|
|
67
|
+
- **1 milestone only.** No multi-milestone arc, no Handoff phase. The demo IS the artifact.
|
|
68
|
+
- **NO mock data.** Real backend, real database, real auth. Hardcoded JSON in components is a hard-block. If the data needs Supabase, ship Supabase.
|
|
69
|
+
- **Real agent/platform functionality.** The thing actually works end-to-end. A demo with broken flows is not a Qualia demo.
|
|
70
|
+
- **DESIGN.md mandatory.** OKLCH palette, distinctive typography, full token system. Slop-detect runs hard-block.
|
|
71
|
+
- **Focus = design + functionality.** No sales decks, no placeholder copy, no "lorem ipsum" anywhere.
|
|
72
|
+
|
|
73
|
+
Speed in a demo comes from skipping multi-milestone planning, NEVER from skipping design quality, mocking the backend, or cutting corners on the core flow.
|
|
74
|
+
|
|
75
|
+
### Step 2. Brownfield Check
|
|
54
76
|
|
|
55
77
|
```bash
|
|
56
78
|
test -f package.json && echo "HAS_PACKAGE"
|
|
@@ -58,23 +80,27 @@ test -d .git && echo "HAS_GIT"
|
|
|
58
80
|
test -f .planning/codebase/README.md && echo "ALREADY_MAPPED"
|
|
59
81
|
```
|
|
60
82
|
|
|
61
|
-
If existing code is detected AND not already mapped,
|
|
83
|
+
If existing code is detected AND not already mapped, **AskUserQuestion**:
|
|
62
84
|
|
|
63
|
-
|
|
85
|
+
- header: "Existing code detected"
|
|
86
|
+
- question: "Run `/qualia-map` to scan the repo first?"
|
|
87
|
+
- options: ["Yes — map it", "No — proceed without mapping"]
|
|
64
88
|
|
|
65
|
-
|
|
89
|
+
If yes, invoke the `qualia-map` skill inline, wait for completion, then continue. If quick prototype + brownfield, skip the map (quick is for greenfield trivial work; brownfield + quick is contradictory — route to `/qualia-feature` instead).
|
|
66
90
|
|
|
67
|
-
-
|
|
68
|
-
|
|
69
|
-
|
|
70
|
-
|
|
71
|
-
|
|
91
|
+
### Step 3. One-line Pitch (free-text — minimal, no clarification round)
|
|
92
|
+
|
|
93
|
+
The shape is locked, now capture the content in one sentence:
|
|
94
|
+
|
|
95
|
+
> **"What are you building? One sentence — a stranger should understand it."**
|
|
96
|
+
|
|
97
|
+
Accept whatever the user says, even if broad. **Do NOT start an ad-hoc clarification round here.** Depth comes from the structured discovery interview in Step 4, not from free-form questioning. If the answer is "a SaaS platform" — that's fine, write it down, move on. `/qualia-discuss` will refine it through its 8 or 14 structured questions.
|
|
72
98
|
|
|
73
|
-
|
|
99
|
+
This is the ONLY free-text question in the kickoff flow. Everything else is `AskUserQuestion`.
|
|
74
100
|
|
|
75
|
-
|
|
101
|
+
### Step 4. Mandatory Discovery Interview (PROJECT MODE)
|
|
76
102
|
|
|
77
|
-
|
|
103
|
+
**Hard rule:** This is the next tool call after Step 3. No ad-hoc clarification, no free-form follow-up, no "let me ask a few quick things first." If the one-line pitch was "a SaaS platform", you invoke `/qualia-discuss` NOW — that skill's structured questions are how breadth gets refined into depth.
|
|
78
104
|
|
|
79
105
|
Invoke `/qualia-discuss` inline in PROJECT MODE — non-technical kickoff interview. 8 questions for demo, 14 for full project. Pass `PROJECT_TYPE` so the discuss skill skips the type question.
|
|
80
106
|
|
|
@@ -94,11 +120,11 @@ After the interview returns, `.planning/project-discovery.md` exists with the us
|
|
|
94
120
|
|
|
95
121
|
If "More questions": re-invoke `/qualia-discuss` for additional rounds. Otherwise continue.
|
|
96
122
|
|
|
97
|
-
### Step
|
|
123
|
+
### Step 5. Detect Project Type
|
|
98
124
|
|
|
99
|
-
From questioning answers, infer type → `website` | `ai-agent` | `voice-agent` | `mobile-app` | `null`. If matched, `cat ~/.claude/qualia-templates/projects/{type}.md` gives suggested milestone arc. Store `template_type` for Step
|
|
125
|
+
From questioning answers, infer type → `website` | `ai-agent` | `voice-agent` | `mobile-app` | `null`. If matched, `cat ~/.claude/qualia-templates/projects/{type}.md` gives suggested milestone arc. Store `template_type` for Step 13.
|
|
100
126
|
|
|
101
|
-
### Step
|
|
127
|
+
### Step 6. Design Direction (frontend only)
|
|
102
128
|
|
|
103
129
|
- header: "Design"
|
|
104
130
|
- question: "What's the design vibe?"
|
|
@@ -106,7 +132,7 @@ From questioning answers, infer type → `website` | `ai-agent` | `voice-agent`
|
|
|
106
132
|
|
|
107
133
|
Plus free-text: "Any brand colors or reference sites I should look at?"
|
|
108
134
|
|
|
109
|
-
### Step
|
|
135
|
+
### Step 7. Client Context
|
|
110
136
|
|
|
111
137
|
- header: "Client"
|
|
112
138
|
- question: "Client project or internal?"
|
|
@@ -117,7 +143,7 @@ If client, ask name. Check saved prefs:
|
|
|
117
143
|
node ~/.claude/bin/knowledge.js search "{client name}"
|
|
118
144
|
```
|
|
119
145
|
|
|
120
|
-
### Step
|
|
146
|
+
### Step 8. Write PROJECT.md
|
|
121
147
|
|
|
122
148
|
Create `.planning/PROJECT.md` from the template. Include: client, what we're building, core value, validated + active requirements (empty for greenfield), out of scope, stack, design direction, decisions table.
|
|
123
149
|
|
|
@@ -127,7 +153,7 @@ git add .planning/PROJECT.md
|
|
|
127
153
|
git commit -m "docs: initialize project"
|
|
128
154
|
```
|
|
129
155
|
|
|
130
|
-
### Step
|
|
156
|
+
### Step 8a. Seed CONTEXT.md and decisions/ (v5.0 — REQUIRED)
|
|
131
157
|
|
|
132
158
|
The domain glossary is the single highest-leverage piece of substrate — every road agent loads it BEFORE PROJECT.md/DESIGN.md. Misalignment is the #1 failure mode in AI coding; CONTEXT.md kills it.
|
|
133
159
|
|
|
@@ -156,7 +182,7 @@ git commit -m "docs: seed CONTEXT.md domain glossary + decisions/ folder"
|
|
|
156
182
|
|
|
157
183
|
The glossary stays terse — one sentence per entry. It's loaded into every agent spawn; bloat costs tokens. `/qualia-discuss` will grow it inline as decisions crystallize during phase planning.
|
|
158
184
|
|
|
159
|
-
### Step
|
|
185
|
+
### Step 8b. Write PRODUCT.md (v4.5.0 — REQUIRED)
|
|
160
186
|
|
|
161
187
|
`PRODUCT.md` is the "who and why" every road agent reads before designing or building. It is **required** — the planner, builder, and verifier all load it as substrate.
|
|
162
188
|
|
|
@@ -175,7 +201,7 @@ git add .planning/PRODUCT.md
|
|
|
175
201
|
git commit -m "docs: PRODUCT.md — register, users, voice, anti-references"
|
|
176
202
|
```
|
|
177
203
|
|
|
178
|
-
### Step
|
|
204
|
+
### Step 9. Create config.json
|
|
179
205
|
|
|
180
206
|
```json
|
|
181
207
|
{
|
|
@@ -192,7 +218,7 @@ git commit -m "docs: PRODUCT.md — register, users, voice, anti-references"
|
|
|
192
218
|
|
|
193
219
|
**Note:** `workflow.research` is ALWAYS `true` for v4. It exists for telemetry but is no longer read as a gate.
|
|
194
220
|
|
|
195
|
-
### Step
|
|
221
|
+
### Step 10. Create DESIGN.md (frontend projects — v4.5.0 OKLCH-first)
|
|
196
222
|
|
|
197
223
|
If frontend work is involved, generate `.planning/DESIGN.md` from `templates/DESIGN.md`. The generation MUST commit to four things upfront (these go in §1 of DESIGN.md):
|
|
198
224
|
|
|
@@ -219,7 +245,7 @@ git add .planning/DESIGN.md .planning/config.json
|
|
|
219
245
|
git commit -m "docs: DESIGN.md — direction commit + OKLCH palette + tokens"
|
|
220
246
|
```
|
|
221
247
|
|
|
222
|
-
### Step
|
|
248
|
+
### Step 11. Run Research (ALWAYS, no permission ask)
|
|
223
249
|
|
|
224
250
|
**In v4, research runs unconditionally.** The previous `workflow.research` gate is gone. Skipping research leads to generic roadmaps and surprises late in the project — the 4-agent cost is worth it.
|
|
225
251
|
|
|
@@ -253,7 +279,7 @@ node ~/.claude/bin/qualia-ui.js ok "Research complete"
|
|
|
253
279
|
```
|
|
254
280
|
Display top 3 from SUMMARY.md (stack recommendation, table stakes, top pitfall).
|
|
255
281
|
|
|
256
|
-
### Step
|
|
282
|
+
### Step 12. Feature Scoping (Multi-Milestone)
|
|
257
283
|
|
|
258
284
|
Read `.planning/research/FEATURES.md` and present the feature landscape. Features are scoped **to milestones** — you'll decide per-feature which milestone owns it.
|
|
259
285
|
|
|
@@ -271,7 +297,7 @@ Track selections:
|
|
|
271
297
|
|
|
272
298
|
Gather any additional requirements the user wants that research missed.
|
|
273
299
|
|
|
274
|
-
### Step
|
|
300
|
+
### Step 13. Run Roadmapper
|
|
275
301
|
|
|
276
302
|
```bash
|
|
277
303
|
node ~/.claude/bin/qualia-ui.js banner roadmap
|
|
@@ -279,12 +305,12 @@ node ~/.claude/bin/qualia-ui.js banner roadmap
|
|
|
279
305
|
|
|
280
306
|
**Roadmapper output branches on `PROJECT_TYPE`:**
|
|
281
307
|
|
|
282
|
-
- **Demo** (`PROJECT_TYPE=demo`): roadmapper produces a 1-milestone JOURNEY.md (the demo milestone, 2-4 phases) plus a matching REQUIREMENTS.md and a fully-detailed ROADMAP.md. No "Handoff" milestone is appended — the demo is its own complete artifact. The journey-tree at Step
|
|
308
|
+
- **Demo** (`PROJECT_TYPE=demo`): roadmapper produces a 1-milestone JOURNEY.md (the demo milestone, 2-4 phases) plus a matching REQUIREMENTS.md and a fully-detailed ROADMAP.md. No "Handoff" milestone is appended — the demo is its own complete artifact. The journey-tree at Step 14 shows a single rung; the "extend to full project" branch is handled later by `/qualia-milestone` if the client signs.
|
|
283
309
|
- **Full project** (`PROJECT_TYPE=full`): roadmapper produces the standard 2-5 milestone arc ending in Handoff. Milestone 1 fully detailed, M2..M{N-1} sketched (unless `--full-detail`).
|
|
284
310
|
|
|
285
311
|
Spawn the roadmapper with `<project_type>$PROJECT_TYPE</project_type>` in the prompt. If the user passed `--full-detail`, include `<full_detail>true</full_detail>` so the roadmapper writes complete phase detail for ALL milestones (full project only; demo always has full detail because there's only one milestone). See REFERENCE.md section "Roadmapper prompt" for the verbatim prompt template.
|
|
286
312
|
|
|
287
|
-
### Step
|
|
313
|
+
### Step 14. Present the Journey (single view)
|
|
288
314
|
|
|
289
315
|
Render the branded journey ladder:
|
|
290
316
|
|
|
@@ -296,7 +322,7 @@ This shows M1..M{N} as a vertical ladder: shipped milestones get a green dot, cu
|
|
|
296
322
|
|
|
297
323
|
Also narrate the one-glance summary. See REFERENCE.md section "Journey ladder format" for the ASCII template.
|
|
298
324
|
|
|
299
|
-
### Step
|
|
325
|
+
### Step 15. Approval Gate (single — for the whole journey)
|
|
300
326
|
|
|
301
327
|
- header: "Journey"
|
|
302
328
|
- question: "Does this journey work for you?"
|
|
@@ -326,7 +352,7 @@ node ~/.claude/bin/qualia-ui.js info "Full phase detail for each later milestone
|
|
|
326
352
|
|
|
327
353
|
(Skip this block when `--full-detail` was used — all milestones are already fully planned in that case.)
|
|
328
354
|
|
|
329
|
-
### Step
|
|
355
|
+
### Step 16. Environment Setup
|
|
330
356
|
|
|
331
357
|
Supabase project? `supabase link` or create. Vercel project? `vercel link`. Env vars? `.env.local` with placeholders from PROJECT.md stack.
|
|
332
358
|
|
|
@@ -337,7 +363,7 @@ git add .gitignore
|
|
|
337
363
|
git commit -m "chore: environment setup" 2>/dev/null
|
|
338
364
|
```
|
|
339
365
|
|
|
340
|
-
### Step
|
|
366
|
+
### Step 17. Auto-Apply Gate (or stop here)
|
|
341
367
|
|
|
342
368
|
If invoked with `--auto`, skip straight into building Milestone 1:
|
|
343
369
|
|
|
@@ -390,15 +416,17 @@ Do NOT use `--quick` for: client projects, anything with compliance stakes, anyt
|
|
|
390
416
|
|
|
391
417
|
## Rules
|
|
392
418
|
|
|
393
|
-
1. **Project type is the first
|
|
394
|
-
2. **
|
|
395
|
-
3. **
|
|
396
|
-
4. **
|
|
397
|
-
5. **
|
|
398
|
-
6. **
|
|
399
|
-
7. **
|
|
400
|
-
8. **
|
|
401
|
-
9. **
|
|
402
|
-
10. **
|
|
403
|
-
11. **
|
|
404
|
-
12. **
|
|
419
|
+
1. **Project type is the first question, period.** Step 1 (Demo / Full / Quick) is the literal first interaction with the user — even before "what are you building". Every downstream step branches on the answer. Don't skip it, don't infer it, don't ask anything before it.
|
|
420
|
+
2. **AskUserQuestion for every discrete-choice question.** Project type, brownfield gate, design vibe, client type, approval gate, auto-chain — all use the interactive UI. The ONLY free-text question in the kickoff flow is the Step 3 one-line pitch. No plain-text prompts for anything that has a closed set of answers.
|
|
421
|
+
3. **No ad-hoc clarification questioning.** After Step 3 (one-line pitch), the next tool call is `/qualia-discuss`. No "let me ask a few quick things first", no "that's too broad, can you clarify". Depth is the discuss skill's job — not yours.
|
|
422
|
+
4. **Discovery interview is mandatory (v5.6).** Step 4 always invokes `/qualia-discuss` in PROJECT MODE. No free-form questioning loop, no "I'll just sketch PROJECT.md from the user's first message." The interview is 8 questions for demo, 14 for full project.
|
|
423
|
+
5. **Research runs automatically.** No permission ask. Only `--quick` skips it. Demo path uses `<scope>quick</scope>` (3-call budget per researcher); full project uses standard 8-call budget.
|
|
424
|
+
6. **Demo design philosophy is non-negotiable.** Real backend always (Supabase, real auth), DESIGN.md mandatory, slop-detect hard-block, 1 milestone, focus on real agent/platform functionality + design quality. No mock data, no lorem ipsum, no broken flows. Speed comes from skipping multi-milestone planning, never from skipping design quality, mocking the backend, or cutting corners on the core flow. A demo that uses mock data is not a Qualia demo.
|
|
425
|
+
7. **Demos are 1 milestone, full projects are 2-5.** Demo journeys have no "Handoff" — the demo IS the artifact. Full projects always end in Handoff (fixed 4 phases). The journey-tree adapts to both shapes.
|
|
426
|
+
8. **The full-project journey includes Handoff.** Every full project's final milestone is literally named "Handoff" with 4 standard phases. The roadmapper enforces this.
|
|
427
|
+
9. **Single approval gate.** One gate for the whole journey. Not per-milestone, not per-phase.
|
|
428
|
+
10. **Milestone 1 is fully detailed (full projects).** M2..M{N-1} are sketched. Detail fills in when each milestone opens. Demos are always fully detailed because they're 1 milestone.
|
|
429
|
+
11. **STATE.md through state.js.** Never edit STATE.md or tracking.json by hand.
|
|
430
|
+
12. **Inline skill invocation.** When Step 2 offers `/qualia-map` or Step 4 invokes `/qualia-discuss`, invoke it inline — don't exit.
|
|
431
|
+
13. **CONTEXT.md is mandatory.** Every project gets a domain glossary at `.planning/CONTEXT.md`. Seeded from the discovery interview answers. Loaded by every road agent. Kept terse.
|
|
432
|
+
14. **ADRs are scarce.** `.planning/decisions/` exists from day one but only fills with hard-to-reverse, surprising-without-context, real-tradeoff decisions. Cargo-culting ADRs ruins the signal.
|
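The per-type numbers stated in rules 4, 5 and 7, together with the Step 13 roadmapper prompt tags, can be gathered in one lookup. The following is only a minimal sketch: `kickoffPlan`, its option name, and the field names of the returned object are hypothetical and are not part of qualia-framework. The quick path is left out because this section does not spell out its values.

```javascript
// Hypothetical helper for illustration only; not an API of qualia-framework.
// The numbers come from the Rules section: 8 vs 14 interview questions,
// 3-call vs 8-call research budget, 1 milestone vs 2-5 milestones.
function kickoffPlan(projectType, { fullDetail = false } = {}) {
  const plans = {
    demo: {
      interviewQuestions: 8,
      researchCallBudget: 3,
      milestones: { min: 1, max: 1 },
      endsInHandoff: false, // the demo is its own complete artifact
    },
    full: {
      interviewQuestions: 14,
      researchCallBudget: 8,
      milestones: { min: 2, max: 5 },
      endsInHandoff: true, // final milestone is named "Handoff"
    },
  };

  const plan = plans[projectType];
  if (!plan) {
    throw new Error(`Unsupported project type in this sketch: ${projectType}`);
  }

  // Tags the roadmapper prompt carries, per Step 13.
  const promptTags = [`<project_type>${projectType}</project_type>`];

  // --full-detail applies to full projects only; a demo has a single
  // milestone and is therefore always fully detailed.
  if (fullDetail && projectType === 'full') {
    promptTags.push('<full_detail>true</full_detail>');
  }

  return { ...plan, promptTags };
}

// Example: a full project started with --full-detail.
console.log(kickoffPlan('full', { fullDetail: true }));
```

Keeping the branch in a single table mirrors rule 1: every downstream step reads the same project-type answer instead of re-deriving it.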