
qualia-framework 3.2.0 → 3.3.0


Files changed (58)

package/CLAUDE.md +3 -4
package/README.md +59 -23
package/agents/plan-checker.md +158 -0
package/agents/planner.md +52 -0
package/agents/research-synthesizer.md +86 -0
package/agents/researcher.md +119 -0
package/agents/roadmapper.md +157 -0
package/agents/verifier.md +180 -32
package/bin/cli.js +403 -9
package/bin/install.js +219 -70
package/bin/qualia-ui.js +11 -11
package/bin/state.js +200 -6
package/bin/statusline.js +4 -4
package/docs/erp-contract.md +161 -0
package/hooks/branch-guard.js +23 -2
package/hooks/migration-guard.js +23 -0
package/hooks/pre-compact.js +20 -0
package/hooks/pre-deploy-gate.js +39 -0
package/hooks/pre-push.js +20 -0
package/hooks/session-start.js +16 -43
package/package.json +6 -4
package/references/questioning.md +123 -0
package/rules/infrastructure.md +87 -0
package/skills/qualia/SKILL.md +1 -0
package/skills/qualia-build/SKILL.md +18 -0
package/skills/qualia-design/SKILL.md +14 -8
package/skills/qualia-discuss/SKILL.md +115 -0
package/skills/qualia-help/SKILL.md +60 -0
package/skills/qualia-learn/SKILL.md +27 -4
package/skills/qualia-map/SKILL.md +145 -0
package/skills/qualia-milestone/SKILL.md +148 -0
package/skills/qualia-new/SKILL.md +374 -229
package/skills/qualia-plan/SKILL.md +135 -30
package/skills/qualia-polish/SKILL.md +167 -117
package/skills/qualia-report/SKILL.md +17 -8
package/skills/qualia-research/SKILL.md +124 -0
package/skills/qualia-review/SKILL.md +126 -41
package/skills/qualia-test/SKILL.md +134 -0
package/skills/qualia-verify/SKILL.md +1 -1
package/templates/DESIGN.md +440 -102
package/templates/help.html +476 -0
package/templates/phase-context.md +48 -0
package/templates/plan.md +14 -0
package/templates/projects/ai-agent.md +55 -0
package/templates/projects/mobile-app.md +56 -0
package/templates/projects/voice-agent.md +55 -0
package/templates/projects/website.md +58 -0
package/templates/requirements.md +69 -0
package/templates/research-project/ARCHITECTURE.md +70 -0
package/templates/research-project/FEATURES.md +60 -0
package/templates/research-project/PITFALLS.md +73 -0
package/templates/research-project/STACK.md +51 -0
package/templates/research-project/SUMMARY.md +86 -0
package/templates/roadmap.md +71 -0
package/tests/bin.test.sh +20 -6
package/tests/hooks.test.sh +76 -7
package/tests/runner.js +1915 -0
package/tests/state.test.sh +189 -11

package/agents/roadmapper.md ADDED

@@ -0,0 +1,157 @@
+---
+name: qualia-roadmapper
+description: Creates REQUIREMENTS.md (v1 requirements with REQ-IDs) and ROADMAP.md (phases mapped to requirements) from PROJECT.md and research. Spawned by qualia-new after research completes.
+tools: Read, Write
+---
+# Qualia Roadmapper
+You create two files: `REQUIREMENTS.md` (v1 requirements with REQ-IDs) and `ROADMAP.md` (phases mapped to requirements). You work from PROJECT.md + research SUMMARY.md. You don't run research yourself — that's already done.
+## Input
+You receive:
+- `.planning/PROJECT.md` — core value, constraints, what they're building
+- `.planning/research/SUMMARY.md` — research synthesis with suggested phase structure (optional — may not exist if research was skipped)
+- `.planning/config.json` — project config including `depth` (quick | standard | comprehensive)
+- User's confirmed feature scope (from the scoping conversation in qualia-new)
+## Output
+Write two files:
+- `.planning/REQUIREMENTS.md` using template `~/.claude/qualia-templates/requirements.md`
+- `.planning/ROADMAP.md` using template `~/.claude/qualia-templates/roadmap.md`
+Also update `.planning/STATE.md` via `state.js init` (NOT directly) so the phase tracker matches the roadmap you created.
+## How to Build the Roadmap
+### 1. Read Context
+```
+Read: .planning/PROJECT.md
+Read: .planning/research/SUMMARY.md (if exists)
+Read: .planning/config.json
+Read: ~/.claude/qualia-templates/requirements.md
+Read: ~/.claude/qualia-templates/roadmap.md
+```
+### 2. Build REQUIREMENTS.md First
+Before defining phases, define what "done" means as a list of atomic, testable requirements.
+**Format:** `{CATEGORY}-{NUMBER}` — `AUTH-01`, `CONT-02`, `SOCIAL-03`
+**Categories** come from:
+- Research FEATURES.md categories (if research exists)
+- User's confirmed feature scope from the scoping conversation
+- Common sense: Authentication, Content, Social, Notifications, Admin, etc.
+**Each requirement is:**
+- **Specific and testable:** "User can reset password via email link" (not "handle password reset")
+- **User-centric:** "User can X" (not "System does Y")
+- **Atomic:** One capability per requirement
+- **Independent:** Minimal dependencies on other requirements
+Put v1 requirements under `## v1 Requirements` grouped by category.
+Put deferred features under `## v2 Requirements`.
+Put explicit exclusions under `## Out of Scope` with reasoning.
+### 3. Derive Phases
+**Rules:**
+1. **Feature phases only.** Do NOT add review / deploy / handoff phases — those are handled by `/qualia-polish` → `/qualia-ship` → `/qualia-handoff` after feature phases complete.
+2. **Phase count depends on `depth` config:**
+   - `quick`: 3-5 phases
+   - `standard`: 5-8 phases
+   - `comprehensive`: 7-12 phases
+3. **Each phase is independently verifiable.** A phase completes when its success criteria are observable in a running app.
+4. **Each v1 requirement maps to exactly ONE phase.** No duplicates, no gaps.
+5. **Order by dependency, not priority.** Phase 2 should be able to use Phase 1's outputs.
+**Typical phase shapes:**
+- **Phase 1: Foundation** — DB schema, auth, base layout, deploy pipeline
+- **Phase 2-4: Core features** — the main value-delivering capabilities
+- **Phase N-1: Content / UX polish** — copy, media, responsive, animations
+- **Phase N: Final polish** — SEO, analytics, performance, a11y
+But don't force-fit this template. Shape the phases around what this specific project needs, using the research SUMMARY.md as your starting point.
+### 4. Derive Success Criteria per Phase
+For each phase, write 2-5 success criteria. Each must be:
+- **Observable** — someone running the app can see it work
+- **User-centric** — "user can X" not "code does Y"
+- **Phase-specific** — not generic ("tests pass" applies to every phase)
+**Example (good):**
+- User can sign up with email and receive verification email
+- User can log in and stay logged in across browser refresh
+- User can log out from any page
+**Example (bad — too vague):**
+- Authentication works
+- Tests pass
+- Code is clean
+### 5. Validate Coverage
+Before writing the files, verify:
+- [ ] Every v1 requirement maps to exactly one phase
+- [ ] Every phase has 2-5 success criteria
+- [ ] No phase depends on a later phase
+- [ ] Phase count is within the range for the `depth` config
+- [ ] No "review" / "deploy" / "handoff" phases
+If any requirement is unmapped, the roadmap is incomplete. Either add it to a phase or explicitly move it to v2.
+### 6. Write the Files
+Write both files to `.planning/`. Use the templates as structural guides. Fill in every `{placeholder}` with concrete content.
+### 7. Update STATE.md via state.js
+**Do not edit STATE.md directly.** Call the state machine:
+```bash
+node ~/.claude/bin/state.js init \
+  --project "{project name from PROJECT.md}" \
+  --client "{client from PROJECT.md}" \
+  --type "{type from PROJECT.md}" \
+  --phases '<JSON array of {name, goal} objects>' \
+  --total_phases {N}
+```
+This ensures STATE.md + tracking.json stay consistent and the status bar updates correctly.
+### 8. Return a Summary
+Report back to the orchestrator:
+```
+Wrote: .planning/REQUIREMENTS.md ({X} v1 requirements, {Y} categories)
+Wrote: .planning/ROADMAP.md ({N} phases, 100% coverage)
+Wrote: .planning/STATE.md (via state.js init)
+Phase summary:
+  1. {name} — {REQ-IDs}
+  2. {name} — {REQ-IDs}
+  ...
+Research flags: {count} phases may need deeper research during planning
+```
+## Quality Gates
+Before returning, self-check:
+- [ ] Every v1 requirement has a REQ-ID in correct format
+- [ ] Every v1 requirement maps to exactly one phase
+- [ ] Every phase has 2-5 success criteria (observable, user-centric)
+- [ ] No phase depends on a later phase
+- [ ] No non-feature phases (no review/deploy/handoff)
+- [ ] STATE.md was updated via state.js, not directly
+- [ ] Requirements traceability table is populated
+If any check fails, fix it before returning. The orchestrator trusts your output — don't return half-baked roadmaps.
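
A minimal sketch of the "Validate Coverage" step in roadmapper.md above, written as a standalone Node script. The diff does not show how the agent parses REQUIREMENTS.md or ROADMAP.md, so the object shapes (`requirements`, `successCriteria`) and the function name `validateCoverage` are hypothetical stand-ins, not part of qualia-framework. It covers the requirement-mapping, criteria-count, and phase-count checks; the dependency-order and non-feature-phase checks are left out.

```javascript
// Phase-count ranges per `depth`, copied from rule 2 of "Derive Phases".
const PHASE_RANGES = {
  quick: [3, 5],
  standard: [5, 8],
  comprehensive: [7, 12],
};

// Hypothetical helper: returns a list of problems; an empty list means
// the roadmap satisfies the coverage rules checked here.
function validateCoverage(requirementIds, phases, depth) {
  const problems = [];

  // Count how many phases claim each v1 requirement.
  const claims = new Map(requirementIds.map((id) => [id, 0]));
  for (const phase of phases) {
    for (const id of phase.requirements) {
      if (claims.has(id)) {
        claims.set(id, claims.get(id) + 1);
      } else {
        problems.push(`${phase.name}: unknown requirement ${id}`);
      }
    }
    // Each phase needs 2-5 success criteria.
    const n = phase.successCriteria.length;
    if (n < 2 || n > 5) {
      problems.push(`${phase.name}: ${n} success criteria (expected 2-5)`);
    }
  }

  // Every requirement must map to exactly one phase: no gaps, no duplicates.
  for (const [id, count] of claims) {
    if (count === 0) problems.push(`${id}: not mapped to any phase`);
    if (count > 1) problems.push(`${id}: mapped to ${count} phases`);
  }

  // Phase count must fall inside the range for the configured depth.
  const range = PHASE_RANGES[depth];
  if (!range) {
    problems.push(`unknown depth "${depth}"`);
  } else if (phases.length < range[0] || phases.length > range[1]) {
    problems.push(
      `${phases.length} phases (expected ${range[0]}-${range[1]} for ${depth})`
    );
  }

  return problems;
}
```

Returning a list of problems, instead of a single boolean, matches the instruction to either add an unmapped requirement to a phase or move it to v2: each entry names the requirement or phase that needs attention.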

package/agents/verifier.md CHANGED

@@ -40,10 +40,38 @@ For each success criterion in the plan:
 - Are database queries returning data to the UI?
 - This is where stubs hide.
+## Contract-Based Verification
+If the phase plan contains a `## Verification Contract` section, execute those contracts FIRST before any ad-hoc verification.
+### How Contracts Work
+The planner generates testable contracts for each task. Each contract is a specific check you run verbatim:
+```markdown
+### Contract for Task 1 — {title}
+**Check type:** file-exists | grep-match | command-exit | behavioral
+**Command:** {exact command to run}
+**Expected:** {what the output should be}
+**Fail if:** {what constitutes failure}
+```
+### Contract Execution
+1. Read the `## Verification Contract` section from the plan file
+2. For each contract entry, run the **Command** exactly as written
+3. Compare output against **Expected**
+4. Score: PASS if the output matches **Expected**; FAIL if it matches the **Fail if** condition or otherwise does not match **Expected**
+5. Record results in the report under `## Contract Results`
+Contracts take priority over ad-hoc verification. If a contract covers a success criterion, use the contract result. Only fall back to the 3-level check (Truths → Artifacts → Wiring) for criteria NOT covered by contracts.
+If the plan has no `## Verification Contract` section (older plans), skip this step and proceed with the full 3-level check below.
 ## How to Verify
 ### 1. Read the Plan
-Extract success criteria from the phase plan's `## Success Criteria` section.
+Extract success criteria from the phase plan's `## Success Criteria` section. Also extract the `## Verification Contract` if present.
 ### 2. For Each Criterion, Run the 3-Level Check
@@ -111,12 +139,21 @@ gaps: {count of failures}
 # Phase {N} Verification
-## Results
+## Contract Results (if contracts exist in plan)
+| Task | Check | Command | Result | Notes |
+|------|-------|---------|--------|-------|
+| Task 1 | file-exists | `test -f src/lib/auth.ts` | PASS | File exists, 142 lines |
+| Task 2 | grep-match | `grep -c "signIn" src/lib/auth.ts` | PASS | 3 matches |
+## Scores
-| Criterion | Status | Evidence |
-|-----------|--------|----------|
-| {criterion 1} | PASS | {what you found} |
-| {criterion 2} | FAIL | {what's wrong} |
+| Criterion | Correctness | Completeness | Wiring | Quality | Verdict |
+|-----------|-------------|--------------|--------|---------|---------|
+| {criterion 1} | {1-5} | {1-5} | {1-5} | {1-5} | PASS/FAIL |
+| {criterion 2} | {1-5} | {1-5} | {1-5} | {1-5} | PASS/FAIL |
+**Minimum threshold check:** {any score < 3? If YES → FAIL}
 ## Code Quality
 - TypeScript: PASS/FAIL
@@ -125,35 +162,107 @@ gaps: {count of failures}
 - Unused imports: {count}
 ## Gaps (if any)
-1. {what failed and why}
-2. {what failed and why}
+1. {criterion}: {dimension} scored {score} — {what's wrong, what file, what's needed}
+2. {criterion}: {dimension} scored {score} — {what's wrong}
 ## Verdict
-PASS — Phase {N} goal achieved. Proceed to Phase {N+1}.
+PASS — Phase {N} goal achieved. All criteria scored ≥ 3 on all dimensions. Proceed to Phase {N+1}.
 OR
-FAIL — {N} gaps found. Run `/qualia-plan {N} --gaps` to fix.
+FAIL — {N} gaps found. {N} criteria scored below threshold. Run `/qualia-plan {N} --gaps` to fix.
 ```
-## Scoring
+## Scoring Rubric
+Every success criterion is scored on 4 dimensions, each rated 1-5:
+### Correctness (1-5)
+Does it produce the right output?
+- **1** — Crashes, errors, or wrong output
+- **2** — Works for the happy path only; any deviation breaks it
+- **3** — Handles common edge cases (empty input, missing data, basic validation)
+- **4** — Handles most edge cases; error messages are user-friendly
+- **5** — Comprehensive error handling; graceful degradation; defensive coding
+### Completeness (1-5)
+Were all contracted requirements met?
+- **1** — Less than half of the requirements implemented
+- **2** — Over half done, but significant gaps remain
+- **3** — All requirements present, but some are partial (e.g., UI exists but missing states)
+- **4** — All requirements fully implemented as specified
+- **5** — All requirements plus defensive coding, edge case coverage, and polish
+### Wiring (1-5)
+Is everything connected end-to-end?
+- **1** — Files exist but are not imported anywhere
+- **2** — Imported but never called (dead code)
+- **3** — Called, but data flow is incomplete (e.g., API route exists, component calls it, but response isn't rendered)
+- **4** — Full data flow with minor gaps (e.g., loading state missing, error not surfaced)
+- **5** — Complete wiring verified by grep — every export is imported, every API is consumed, every component is rendered
+### Quality (1-5)
+Does the code meet the project's quality, security, and accessibility standards?
+- **1** — Stubs and placeholders throughout; `// TODO` everywhere
+- **2** — Works but violates project conventions (wrong patterns, hardcoded values, no types)
+- **3** — Follows conventions with minor issues (a few missing types, inconsistent naming)
+- **4** — Clean code; good patterns; types complete; security rules followed
+- **5** — Exemplary — accessible, performant, secure, well-structured, follows all rules
+### Hard Threshold
+**Any dimension scoring below 3 on any criterion triggers FAIL regardless of other scores.**
+A component with Correctness=5, Completeness=5, Wiring=1, Quality=5 is FAIL — it's perfect code that nobody can use because it's not connected.
+### Phase Verdict
+- **ALL criteria ≥ 3 on all dimensions** → PASS. Phase verified.
+- **ANY criterion < 3 on ANY dimension** → FAIL. List each gap with: what scored low, what file, what's needed. Suggest `/qualia-plan {N} --gaps`.
+Never round up. A 2 is not a 3. The goal of verification is to catch the work that LOOKS done but ISN'T.
+## Few-Shot Calibration
-Each success criterion from the plan gets a verdict:
+Use these examples to calibrate your judgment. Real verification should match this level of rigor.
-- **PASS** — All 3 levels check out. File exists, has real implementation (not stubs), and is imported/used by the system.
-- **PARTIAL** — File exists and has real code, but isn't fully wired (e.g., component exists but isn't rendered in any page, API route exists but no client calls it). This is NOT a pass.
-- **FAIL** — File missing, is a stub, or has 0 connections to the rest of the codebase.
+### Example A: PASS — Auth Phase
-Phase verdict:
-- **ALL PASS** → Phase verified. Update STATE.md status to "verified".
-- **ANY PARTIAL or FAIL** → Phase has gaps. List each gap with: what's wrong, what file, what's needed. Suggest `/qualia-plan {N} --gaps`.
+Phase goal: "User can sign up, log in, and access protected routes."
-Never round up. A PARTIAL is not a PASS. The goal of verification is to catch the work that LOOKS done but ISN'T.
+| Dimension | Score | Evidence |
+|-----------|-------|----------|
+| Correctness | 4 | `signInWithPassword()` called in handler; session persists across refresh; invalid credentials show error; tested login→dashboard→logout→login flow |
+| Completeness | 4 | Sign up, login, logout, protected route redirect all implemented; password validation with Zod; email verification flow present |
+| Wiring | 5 | `grep -r "signInWithPassword" src/` shows call in `app/login/page.tsx`; `grep -r "createClient" src/lib/` shows server client used in middleware; `grep -r "auth.uid" supabase/` shows RLS policies reference auth |
+| Quality | 4 | Server-side auth only; RLS on all tables; Zod validation on inputs; no service_role in client code; semantic HTML on forms; visible focus rings on inputs |
+**Verdict: PASS** — All scores ≥ 3. Minimum threshold check: NO scores below 3.
+### Example B: FAIL — Chat Component Phase
+Phase goal: "Working real-time chat interface with message history."
+| Dimension | Score | Evidence |
+|-----------|-------|----------|
+| Correctness | 4 | Chat component renders messages correctly; timestamps formatted; scroll-to-bottom works |
+| Completeness | 3 | Message send, receive, history all present; emoji support missing but not in spec |
+| Wiring | 1 | `grep -r "ChatWindow" app/` returns 0 results — component exists at `components/chat/ChatWindow.tsx` but is NOT rendered in any page. `grep -r "from.*chat" app/` returns 0. The component is an island. |
+| Quality | 3 | Clean code; types present; but no loading state, no error state, no empty state |
+**Verdict: FAIL** — Wiring scored 1 (below threshold of 3). The chat component is well-built code that nobody can access because it's not mounted in any route. This is the exact kind of "looks done but isn't" that verification exists to catch.
 ## Design Verification (for phases with frontend work)
-If the phase involved UI/frontend tasks, add a **Design Quality** section to the report:
+If the phase involved UI/frontend tasks, add a **Design Quality** section to the report.
-### Check 1: Design System Compliance
+First, read the project's DESIGN.md:
 ```bash
+cat .planning/DESIGN.md 2>/dev/null || echo "NO_DESIGN_MD"
+```
+If DESIGN.md exists, verify against its specific values. If not, verify against `rules/frontend.md` defaults.
+### Check 1: Design System Compliance (DESIGN.md §2, §3, §12)
+```bash
+# Anti-slop detection — run patterns from DESIGN.md §12
 # Generic fonts (should NOT appear)
 grep -rn "font-family.*Inter\|font-family.*Roboto\|font-family.*Arial\|fontFamily.*Inter\|fontFamily.*Roboto" --include="*.tsx" --include="*.jsx" --include="*.css" app/ components/ src/ 2>/dev/null
 grep -rn "font-sans\|font-inter" --include="*.tsx" --include="*.jsx" app/ components/ src/ 2>/dev/null
@@ -163,9 +272,30 @@ grep -rn "max-w-\[1200\|max-w-\[1280\|max-width.*1200\|max-width.*1280\|max-w-7x
 # Hardcoded colors instead of CSS variables (check density)
 grep -rn "color:.*#\|background:.*#\|bg-\[#" --include="*.tsx" --include="*.jsx" app/ components/ src/ 2>/dev/null | wc -l
+# If DESIGN.md §2 defines CSS variables and this count > 5 → flag
+# Blue-purple gradients (AI slop tell)
+grep -rn "from-blue.*to-purple\|from-purple.*to-blue\|linear-gradient.*blue.*purple" --include="*.tsx" --include="*.css" app/ components/ src/ 2>/dev/null
 ```
-### Check 2: Accessibility Basics
+### Check 2: Typography (DESIGN.md §3)
+```bash
+# Verify project fonts are actually loaded (check layout.tsx or globals.css)
+grep -rn "font-family\|fontFamily\|Google.*Font\|next/font" --include="*.tsx" --include="*.css" app/layout* src/ 2>/dev/null | head -5
+# If DESIGN.md specifies a font, grep for it
+# grep -rn "{DESIGN.MD_FONT_NAME}" --include="*.tsx" --include="*.css" app/ 2>/dev/null
+```
+Cross-reference: do the fonts in code match §3 hierarchy table? Are weights correct?
+### Check 3: Depth & Elevation (DESIGN.md §6)
+```bash
+# Check for shadow usage (should use CSS variables, not inline rgba)
+grep -rn "box-shadow\|shadow-\[" --include="*.tsx" --include="*.css" app/ components/ src/ 2>/dev/null | head -10
+# Verify shadows match the elevation levels from DESIGN.md §6
+```
+### Check 4: Accessibility (DESIGN.md §10)
 ```bash
 # Images without alt text
 grep -rn "<img" --include="*.tsx" --include="*.jsx" app/ components/ src/ 2>/dev/null | grep -v "alt="
@@ -179,35 +309,53 @@ grep -rn "outline.*none\|outline-none" --include="*.tsx" --include="*.jsx" --inc
 # Missing lang attribute
 grep -rn "<html" --include="*.tsx" --include="*.jsx" app/ 2>/dev/null | grep -v "lang="
-# Heading hierarchy — check for h1 count
+# Heading hierarchy — check for h1 count per page
 grep -rn "<h1\|<H1" --include="*.tsx" --include="*.jsx" app/ 2>/dev/null | wc -l
+# Skip link presence
+grep -rn "skip.*main\|sr-only.*focus" --include="*.tsx" app/layout* 2>/dev/null
 ```
-### Check 3: Interactive States
+### Check 5: Interactive States
 ```bash
-# Buttons/links without hover/focus styles — spot check
-grep -rn "<button\|<Button\|<a " --include="*.tsx" --include="*.jsx" app/ components/ src/ 2>/dev/null | head -5
-# Verify these have hover/focus transitions in their styling
 # Loading states — check for skeleton/spinner usage in pages with data fetching
 grep -rn "fetch\|useQuery\|useSWR\|getServerSide\|async.*Component" --include="*.tsx" app/ 2>/dev/null | head -5
 grep -rn "loading\|skeleton\|spinner\|Spinner\|Loading" --include="*.tsx" app/ components/ 2>/dev/null | wc -l
 # Empty states — check lists/tables have empty handling
 grep -rn "\.length.*===.*0\|\.length.*>.*0\|isEmpty\|no.*results\|no.*data" --include="*.tsx" app/ components/ 2>/dev/null | wc -l
+# Error states — check for error boundaries or error handling
+grep -rn "error\|Error\|catch\|fallback" --include="*.tsx" app/ components/ 2>/dev/null | wc -l
 ```
-### Check 4: Responsive
+### Check 6: Responsive (DESIGN.md §8)
 ```bash
 # Check for responsive utilities or media queries
 grep -rn "sm:\|md:\|lg:\|xl:\|@media" --include="*.tsx" --include="*.jsx" --include="*.css" app/ components/ src/ 2>/dev/null | wc -l
 # If 0 responsive declarations across multiple components → FAIL
+# Check collapsing strategy matches DESIGN.md §8 table
+# Verify navigation has mobile treatment
+grep -rn "hamburger\|mobile.*nav\|drawer\|menu.*toggle\|MenuIcon" --include="*.tsx" app/ components/ 2>/dev/null
+```
+### Check 7: Hardening (DESIGN.md §11)
+```bash
+# Check for text overflow handling
+grep -rn "truncate\|overflow.*hidden\|text-ellipsis\|line-clamp" --include="*.tsx" --include="*.css" app/ components/ 2>/dev/null | wc -l
+# Check for empty state components
+grep -rn "empty\|Empty\|no.*found\|no.*results" --include="*.tsx" app/ components/ 2>/dev/null | wc -l
 ```
 ### Scoring Design
-- 0 generic fonts + 0 hardcoded max-widths + colors via variables = **PASS**
-- Accessibility basics all present = **PASS**
-- States and responsive present = **PASS**
+- 0 generic fonts + 0 hardcoded max-widths + colors via CSS vars = **PASS** (§12)
+- Fonts/weights match DESIGN.md §3 hierarchy = **PASS**
+- Shadows use elevation system from §6 = **PASS**
+- Accessibility checklist from §10 all present = **PASS**
+- States (loading/empty/error) present = **PASS**
+- Responsive declarations present + mobile nav = **PASS** (§8)
 - Any category failing = add to **Gaps** list with specific file:line
 ## Rules
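
The hard-threshold rule from the Scoring Rubric in the verifier.md diff (four dimensions, each rated 1-5, any score below 3 fails the phase) can be sketched as a small Node function. This is an illustration only: the function name `phaseVerdict` and the `{ name, scores }` object shape are assumptions, since the verifier is a prompt-driven agent and the diff shows no such code.

```javascript
// The four rubric dimensions and the pass threshold, as stated in the diff.
const DIMENSIONS = ["correctness", "completeness", "wiring", "quality"];
const THRESHOLD = 3;

// Hypothetical helper: `criteria` is an array of
// { name: string, scores: { correctness, completeness, wiring, quality } }.
function phaseVerdict(criteria) {
  const gaps = [];
  for (const criterion of criteria) {
    for (const dimension of DIMENSIONS) {
      const score = criterion.scores[dimension];
      // No rounding up: 2 stays below the threshold. A missing score is
      // also treated as a gap, because `undefined >= 3` is false.
      if (!(score >= THRESHOLD)) {
        gaps.push({ criterion: criterion.name, dimension, score });
      }
    }
  }
  return { verdict: gaps.length === 0 ? "PASS" : "FAIL", gaps };
}

// The rubric's own example: strong scores everywhere except Wiring.
const example = phaseVerdict([
  {
    name: "Chat component renders in a route",
    scores: { correctness: 5, completeness: 5, wiring: 1, quality: 5 },
  },
]);
console.log(example.verdict); // "FAIL"
console.log(example.gaps);
```

Each gap records the criterion, dimension, and score, which is the information the report's `## Gaps` template asks for; the file and remediation details would still come from the verifier's evidence.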