qualia-framework 6.3.0 → 6.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (64)
  1. package/AGENTS.md +8 -8
  2. package/CLAUDE.md +6 -5
  3. package/README.md +17 -39
  4. package/bin/cli.js +64 -16
  5. package/bin/command-surface.js +6 -1
  6. package/bin/install.js +26 -11
  7. package/bin/learning-candidates.js +217 -0
  8. package/bin/prune-deprecated.js +64 -0
  9. package/bin/qualia-ui.js +1 -0
  10. package/bin/runtime-manifest.js +4 -0
  11. package/bin/security-scan.js +409 -0
  12. package/bin/state.js +106 -1
  13. package/bin/status-snapshot.js +363 -0
  14. package/guide.md +18 -33
  15. package/hooks/pre-compact.js +232 -0
  16. package/package.json +8 -2
  17. package/references/archetypes/ai-agent.md +89 -0
  18. package/references/archetypes/voice-agent.md +60 -0
  19. package/references/archetypes/web-app.md +67 -0
  20. package/references/archetypes/website.md +78 -0
  21. package/rules/constitution.md +42 -0
  22. package/skills/qualia/SKILL.md +3 -1
  23. package/skills/qualia-build/SKILL.md +1 -1
  24. package/skills/qualia-discuss/SKILL.md +1 -1
  25. package/skills/qualia-doctor/SKILL.md +1 -1
  26. package/skills/qualia-feature/SKILL.md +1 -1
  27. package/skills/qualia-fix/SKILL.md +1 -1
  28. package/skills/qualia-idk/SKILL.md +245 -0
  29. package/skills/qualia-learn/SKILL.md +1 -1
  30. package/skills/qualia-map/SKILL.md +1 -1
  31. package/skills/qualia-milestone/SKILL.md +1 -1
  32. package/skills/qualia-new/SKILL.md +1 -1
  33. package/skills/qualia-optimize/SKILL.md +1 -1
  34. package/skills/qualia-plan/SKILL.md +1 -1
  35. package/skills/qualia-polish/SKILL.md +1 -1
  36. package/skills/qualia-postmortem/SKILL.md +1 -1
  37. package/skills/qualia-report/SKILL.md +1 -1
  38. package/skills/qualia-research/SKILL.md +1 -1
  39. package/skills/qualia-review/SKILL.md +1 -1
  40. package/skills/qualia-road/SKILL.md +1 -1
  41. package/skills/qualia-scope/SKILL.md +123 -0
  42. package/skills/qualia-secure/SKILL.md +105 -0
  43. package/skills/qualia-test/SKILL.md +1 -1
  44. package/skills/qualia-verify/SKILL.md +1 -1
  45. package/skills/zoho-workflow/SKILL.md +1 -1
  46. package/tests/bin.test.sh +9 -9
  47. package/tests/install-smoke.test.sh +3 -3
  48. package/tests/lib.test.sh +17 -10
  49. package/tests/published-install-smoke.test.sh +3 -3
  50. package/tests/refs.test.sh +29 -22
  51. package/tests/runner.js +3 -3
  52. package/tests/state.test.sh +38 -7
  53. package/docs/archive/CHANGELOG-pre-v4.md +0 -855
  54. package/docs/archive/v4.0.0-review.md +0 -288
  55. package/docs/ecosystem-operating-model.md +0 -121
  56. package/docs/research/2026-04-21-command-quality-deep-research.md +0 -128
  57. package/docs/research/2026-04-21-industry-best-practices.md +0 -255
  58. package/docs/research/2026-05-11-deep-research.md +0 -189
  59. package/docs/reviews/matt-pocock-skills-analysis.md +0 -300
  60. package/docs/reviews/v4.1.0-audit.html +0 -1488
  61. package/docs/reviews/v4.1.0-audit.md +0 -263
  62. package/docs/reviews/v6.2.1-revival-audit.md +0 -53
  63. package/docs/reviews/v6.2.2-memory-erp-audit.md +0 -41
  64. package/docs/reviews/v6.2.3-erp-id-guard.md +0 -15
package/docs/reviews/matt-pocock-skills-analysis.md
@@ -1,300 +0,0 @@
- # Matt Pocock's Skills — Deep Analysis & Qualia 5.0 Proposal
-
- **Source material**
- - Repo: https://github.com/mattpocock/skills (46k stars, #1 on GitHub Trending the day it dropped)
- - Course: https://www.aihero.dev/cohorts/claude-code-for-real-engineers-2026-04
- - Talk: https://www.youtube.com/watch?v=kZ-zzHVUrO4
-
- **Author of this analysis:** Synthesized 2026-05-03 from a full read of Matt's CLAUDE.md, CONTEXT.md, and every SKILL.md in `engineering/`, `productivity/`, and `misc/`.
-
- ---
-
- ## Part 1 — Matt's worldview, distilled
-
- Matt is teaching one thing: **the real bottleneck on AI-coding quality is not the model, it's the alignment between human, agent, code, and domain.** Every skill exists to fix one alignment failure.
-
- ### The four pillars (from his CLAUDE.md)
-
- | Pillar | Source | What it fixes |
- |---|---|---|
- | **Alignment Before Coding** | Pragmatic Programmer — "no-one knows exactly what they want" | Misalignment is the #1 failure mode. Cure: relentless one-question grilling until every branch of the decision tree is resolved. |
- | **Domain Language Matters** | DDD ubiquitous language | Shared vocabulary reduces tokens, improves agent navigation, prevents "what do you mean by user?" rework. |
- | **Feedback Loops Drive Quality** | Pragmatic Programmer — "the rate of feedback is your speed limit" | Without a fast deterministic pass/fail signal, agents fly blind. Cure: build the loop FIRST, then debug. |
- | **Design Discipline** | Ousterhout — "the best modules are deep" | Daily attention to architecture or entropy wins. Cure: deepening refactors framed in domain language. |
-
- ### The structural innovations
-
- 1. **CONTEXT.md as a domain glossary** — not a project README, not a spec. A glossary of terms with **`Avoid`** lists. The agent literally reads "use *milestone*, not *sprint*" and conforms. Reduces token waste from disambiguation rounds.
- 2. **ADRs (Architecture Decision Records)** in `docs/adr/` — created sparingly, only for hard-to-reverse, surprising-without-context decisions with real trade-offs. Becomes the load-bearing memory.
- 3. **Vertical slices everywhere** — both for issues (each issue = a complete path through schema → API → UI → tests) and for TDD (one test → one impl → repeat). Anti-horizontal as a **rule**.
- 4. **Skill descriptions are the API surface** — "the only thing your agent sees." Discriminative descriptions beat clever prose. <1024 chars. Third-person. Always include "Use when {context}".
- 5. **Skills under 100 lines** — overflow goes to `REFERENCE.md` / `EXAMPLES.md` / `scripts/` (progressive disclosure).
- 6. **`disable-model-invocation: true`** — some skills (e.g. `zoom-out`) are user-fired-only. Prevents auto-firing on weak signals.
- 7. **scripts/ co-located with skills** — for deterministic operations. "Trade tokens for reliability" — the script can't hallucinate.
-
- ### The 21 skills, mapped
-
- | Bucket | Skill | One-liner | Maps to (Qualia equivalent) |
- |---|---|---|---|
- | eng | **diagnose** | Build feedback loop FIRST, then reproduce → hypothesize → instrument → fix → regression-test | `qualia-debug` (weaker — doesn't enforce loop-first) |
- | eng | **grill-with-docs** | Interview the user, update CONTEXT.md/ADRs inline as decisions crystallize | `qualia-discuss` (much weaker — no glossary update) |
- | eng | **improve-codebase-architecture** | Find shallow modules, propose deepening refactors, use domain language | `qualia-optimize` (broader but lacks Ousterhout language) |
- | eng | **tdd** | Vertical-slice tracer-bullet loop. ANTI horizontal slicing (writing all tests then all code) | `qualia-test` (lacks the methodology rule) |
- | eng | **to-prd** | Conversation → PRD → GH Issue with `needs-triage` label | `qualia-new` (stays internal; never touches GH issues) |
- | eng | **to-issues** | PRD → independent vertical-slice GH Issues, dependency-ordered | **NO EQUIVALENT** |
- | eng | **triage** | State machine: needs-triage / needs-info / ready-for-agent / ready-for-human / wontfix | **NO EQUIVALENT** (we use tracking.json status fields, not GH labels) |
- | eng | **zoom-out** | "Map this area using the project glossary" — 5-line skill, user-only | **NO EQUIVALENT** |
- | eng | **setup-matt-pocock-skills** | Detect repo state, write `docs/agents/{tracker,labels,domain}.md`, append to CLAUDE.md | `qualia-map` (broader, but doesn't externalize agent config) |
- | prod | **caveman** | 75% token reduction via dropped articles/fillers, fragment sentences, abbreviations, arrows | **NO EQUIVALENT** |
- | prod | **grill-me** | Aggressive one-question-at-a-time interview until decision tree resolves | partial overlap with `qualia-discuss` |
- | prod | **write-a-skill** | Meta-skill for authoring skills — frontmatter rules, file split rules | `qualia-skill-new` (similar) |
- | misc | **git-guardrails** | Block dangerous git commands | partial — we have hooks |
- | misc | **migrate-to-shoehorn** | Codemod-y tool | n/a |
- | misc | **scaffold-exercises** | Course-specific | n/a |
- | misc | **setup-pre-commit** | Husky setup | n/a |
-
- ---
-
- ## Part 2 — Where Qualia is already strong
-
- We should NOT abandon what's working:
-
- 1. **Phase/wave/task hierarchy** — Matt has nothing equivalent to `JOURNEY.md → ROADMAP.md → phase plans → tasks with verification contracts`. This is a Qualia advantage.
- 2. **Fresh-context subagent spawning** — task-level context isolation prevents rot. Matt mentions subagents conceptually; we do it systematically.
- 3. **The verifier loop with goal-backward checks** — `qualia-verify` is sharper than anything Matt ships.
- 4. **Design substrate (v4.5.0)** — DESIGN.md + design-laws.md + slop-detect + 8-dimension scoring. Matt has nothing on visual design.
- 5. **Smart router (`qualia` skill)** — state-driven next-action recommendation. Matt's flow is more linear.
- 6. **The 4 mandatory Handoff deliverables** — client-delivery rigor that doesn't exist in Matt's repo (he targets engineers building in their own repos, not agencies shipping to clients).
-
- The combination of Matt's alignment discipline + Qualia's execution machinery is the play.
-
- ---
-
- ## Part 3 — Qualia 5.0 proposal: 15 changes ranked by ROI
-
- ### Tier 1 — Adopt these immediately (high ROI, low effort)
-
- #### 1. **CONTEXT.md substrate (domain glossary)** — `.planning/CONTEXT.md`
-
- The single biggest miss. Every Qualia agent currently re-derives domain terms from PROJECT.md prose. Matt's pattern: a structured glossary with `Avoid` lists.
-
- **Format:**
- ```markdown
- ## Language
-
- ### Milestone
- A bounded slice of the journey. Always one of: Foundation, MVP, Polish, Handoff.
- **Avoid:** sprint, iteration, release.
-
- ### Phase
- A unit of work inside a milestone, 2–5 tasks, ends in a verification gate.
- **Avoid:** epic, story, ticket.
-
- ### Task
- A single commit-sized unit with one verification contract.
- **Avoid:** subtask, chore.
-
- ## Relationships
- - Project holds many Milestones
- - Milestone holds many Phases
- - Phase holds many Tasks
- - Task carries one Verification Contract
-
- ## Flagged ambiguities
- - "User" was previously used for both `auth.users` row and customer profile. Now: **AuthUser** (auth row), **Customer** (profile).
- ```
-
- - Generated by `/qualia-new` from the discovery questioning
- - Loaded **first** by every subagent (before PROJECT.md, before DESIGN.md)
- - Updated inline by `/qualia-discuss` and the new `/qualia-grill`
- - **Expected impact:** ~30% reduction in disambiguation back-and-forth, fewer wrong-assumption rework cycles
-
- #### 2. **ADR pattern** — `.planning/decisions/ADR-{N}-{slug}.md`
-
- Replace the ad-hoc `phase-{N}-context.md` with structured ADRs. Created **sparingly**: only when the decision is hard-to-reverse, surprising-without-context, AND involves real trade-offs.
-
- **Format:**
- ```markdown
- # ADR-007 — Use Supabase RLS instead of API-layer auth checks
- Date: 2026-05-03
- Phase: 2
- Status: Accepted
-
- ## Context
- {why this decision is being made now}
-
- ## Decision
- {what we are doing}
-
- ## Consequences
- {what becomes easier; what becomes harder}
-
- ## Alternatives considered
- {what we rejected and why}
- ```
-
- - Loaded by relevant phases via `@.planning/decisions/ADR-007-*.md`
- - Listed in CONTEXT.md "Flagged ambiguities" when they resolve a term
-
- #### 3. **`/qualia-zoom`** — 5-line user-only skill
-
- ```yaml
- ---
- name: qualia-zoom
- description: Zoom out and map the surrounding modules using project glossary terms. Use when you don't know an area of code well or need to orient before editing.
- disable-model-invocation: true
- ---
-
- I don't know this area well. Go up a layer of abstraction. Give me a map of all relevant modules and callers using the vocabulary in `.planning/CONTEXT.md`.
- ```
-
- Tiny. Used 50× a day in mature projects. Replaces ad-hoc "explain this code" requests with a glossary-anchored response.
-
- #### 4. **Caveman mode for subagent spawning** (cost reduction)
-
- Add to the prompt-cache template in `rules/grounding.md`:
-
- > **Subagent compression**: All in-flight spawned-subagent system prompts use compressed form — drop articles, fillers, pleasantries. Use abbreviations (DB, auth, fn, impl, req, res). Fragment sentences over full clauses. Arrow notation for causality. Suspend compression for security warnings, irreversible action confirmations, and multi-step sequences where brevity risks misinterpretation.
-
- The verifier and builder spawn 50+ times per project. ~25% token reduction per spawn. Real money at scale.
-
- #### 5. **Tighten skill descriptions** (the API surface)
-
- Audit all 74 skills. Each description must:
- - Be ≤ 1024 chars
- - Be in third person
- - Include "Use when {specific context}" with **discriminative** triggers
- - Have **zero overlap** with sibling skills
-
- Currently `/qualia`, `/qualia-idk`, `/qualia-resume` all overlap on "lost? what next?" triggers. Wrong-skill-firing happens. Fix it.
-
- ### Tier 2 — Adopt these next (high ROI, medium effort)
-
- #### 6. **`/qualia-grill`** — replace or augment `/qualia-discuss`
-
- Aggressive one-question-at-a-time grilling. Walks every branch of the decision tree. Updates CONTEXT.md inline.
-
- **Critical rule from Matt's grill-me:** *"For each question, provide your recommended answer."* — the agent isn't just asking, it's proposing. The user accepts, edits, or rejects. Way faster than open-ended interview.
-
- Used **before** `/qualia-plan` for high-stakes phases (auth, payments, multi-tenant, anything with regulatory weight).
-
- #### 7. **Restructure `/qualia-debug` around feedback-loop-first**
-
- Current `qualia-debug` jumps to symptom analysis. Matt's diagnose enforces **build the loop first, then everything else is mechanical**. New phase structure:
-
- 1. Build a fast deterministic agent-runnable pass/fail signal (failing test > curl > CLI invocation > headless browser > trace replay > minimal harness > property test > bisection > differential > human-in-loop bash). Pick the highest-tier feasible.
- 2. Reproduce against the loop
- 3. Generate 3–5 ranked falsifiable hypotheses BEFORE testing
- 4. Instrument with `[DEBUG-{tag}]` prefixes (cleanup later)
- 5. Change one variable at a time
- 6. Fix at the correct seam, write regression test there
- 7. Cleanup `[DEBUG-*]` markers, verify scenario, document architectural finding answering "what would have prevented this?"
-
- #### 8. **`/qualia-deepen`** — split out from `/qualia-optimize`
-
- `/qualia-optimize` is currently a kitchen sink (perf + UI + backend + alignment). Pull out architecture-deepening as its own skill that uses Ousterhout's vocabulary explicitly: **depth, locality, leverage, seam, deletion test**.
-
- Process:
- 1. Read `.planning/CONTEXT.md` glossary first
- 2. Walk codebase noting friction points: shallow modules where interface ≈ implementation, modules requiring bouncing across many files for one concept, pure functions extracted only for testability (no locality), tightly-coupled cross-seam leakage
- 3. Present candidates with **Files / Problem / Solution / Benefits**
- 4. Grilling loop: walk the design tree with the user, update CONTEXT.md/ADRs as decisions crystallize
-
- #### 9. **`/qualia-tdd`** — vertical-slice rule enforcement
-
- ```
- WRONG (horizontal): RED test1-5 → GREEN impl1-5
- RIGHT (vertical): RED→GREEN test1→impl1, test2→impl2, ...
- ```
-
- One test at a time. Only enough code to pass. Tests describe **behavior through public interface**, never implementation. Never refactor while red.
-
- This is the #1 missing engineering discipline in the framework. We have `/qualia-test` (generate tests) but no test-first methodology.
-
- #### 10. **Skill file structure cleanup** (progressive disclosure)
-
- Audit pass: every SKILL.md must be **<100 lines**. Overflow goes to:
- - `REFERENCE.md` (advanced/rare features)
- - `EXAMPLES.md` (worked examples)
- - `scripts/` (deterministic operations co-located with the skill)
-
- Move `bin/state.js` and `bin/qualia-ui.js` from `~/.claude/bin/` into `skills/qualia/scripts/` (or symlink). The script being co-located with the skill means it's discoverable when reading the skill.
-
- ### Tier 3 — Strategic bets (very high ROI, larger effort)
-
- #### 11. **`/qualia-issues`** — externalize work to GitHub
-
- Convert the current phase plan into independent vertical-slice GH issues. Each issue:
- - Title describing the slice
- - End-to-end behavior description
- - Acceptance criteria checklist
- - Blocking-dependencies field
- - Label: `needs-triage`
-
- This **unlocks the autonomous loop**: other agents (or other Claude sessions, or Codex, or human contributors) can pull issues from the queue. Currently Qualia only runs in the originating session.
-
- #### 12. **`/qualia-triage`** — state machine over the issue queue
-
- Pulls open issues. Applies labels: `needs-triage` / `needs-info` / `ready-for-agent` / `ready-for-human` / `wontfix`. Routes:
- - `ready-for-agent` → autonomous `/qualia-build` or `/qualia-quick`
- - `ready-for-human` → notification to Fawzi
- - `needs-info` → questions back to reporter
-
- Combined with `/qualia-issues`, this is the **Ralph Wiggum loop** Matt teaches: agent pulls from backlog, builds, verifies, ships, picks next. Human-in-loop is optional, not required.
-
- #### 13. **AGENTS.md emission** alongside CLAUDE.md
-
- `/qualia-new` should write **both** `CLAUDE.md` (Anthropic) and `AGENTS.md` (the cross-vendor open standard now adopted by Codex, Cursor, Continue, Devin, Aider, etc.). Same content, different file. One project, multi-agent compatible. Costs nothing, expands the ecosystem.
-
- #### 14. **`/qualia-onboard`** — adapt-to-existing-repo skill
-
- Matt's `setup-matt-pocock-skills` is the model. Companion to `/qualia-map`. Detects:
- - Existing issue tracker (GH, GL, Jira, local `.scratch/`)
- - Existing labels — maps them to canonical roles
- - Existing CONTEXT.md / docs/adr / glossary structure
-
- Writes `.planning/agents/{tracker.md, labels.md, domain.md}` and appends an `## Agent skills` block to existing CLAUDE.md/AGENTS.md. **Adapts Qualia to the repo's conventions** instead of forcing Qualia conventions on the repo.
-
- This is what makes the framework usable on brownfield client projects without ripping their existing process out.
-
- #### 15. **`disable-model-invocation` flag adoption**
-
- Add to skills that should never auto-fire on a weak trigger: `qualia-zoom`, `qualia-caveman`, anything destructive (`qualia-ship`, `qualia-handoff` arguably). Reduces accidental invocations.
-
- ---
-
- ## Part 4 — Sequencing recommendation
-
- ### Wave 1 (immediate, ~1 day)
- - 1, 2, 3, 4, 5 (CONTEXT.md, ADRs, qualia-zoom, caveman mode, description audit)
-
- ### Wave 2 (short, ~3 days)
- - 6, 7, 8, 9, 10 (qualia-grill, debug restructure, qualia-deepen, qualia-tdd, file structure cleanup)
-
- ### Wave 3 (strategic, ~1 week)
- - 11, 12 (qualia-issues, qualia-triage) — these together unlock autonomous operation
-
- ### Wave 4 (polish, ~2 days)
- - 13, 14, 15 (AGENTS.md, qualia-onboard, disable-model-invocation)
-
- ### Ship as v5.0
- - New release: "Qualia 5.0 — alignment discipline" — leans into Matt's framing
-
- ---
-
- ## Part 5 — What we should NOT copy
-
- - **Matt's flat skill list (no router)**. Our `/qualia` smart router and state machine are better for non-trivial projects.
- - **Matt's lack of design discipline**. Our v4.5.0 design substrate is genuinely ahead. Don't dilute it.
- - **Matt's per-project `.scratch/` dir as default tracker**. We need real GH issues for client work and ERP integration.
- - **Matt's "scaffold-exercises"** etc. — course-specific, not relevant.
-
- ---
-
- ## Part 6 — The honest assessment
-
- Matt is teaching **alignment as the primary engineering discipline**. The framework currently treats alignment as a one-shot at `/qualia-new` and assumes it holds for the rest of the project. It doesn't. Decisions drift. Terms overload. Hypotheses go untested. Feedback loops degrade.
-
- The 15 changes above are not features — they're **rituals** that keep alignment from rotting between phases. That's the x20 unlock. Not more skills. Better-disciplined skills, anchored to a domain glossary, with feedback loops built first.
-
- The single highest-ROI change is **#1 (CONTEXT.md)**. If you do nothing else from this list, do that one. Every other change compounds off it.
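
Reviewer sketch, not part of the package: the `Avoid` lists in the CONTEXT.md format proposed under change #1 are regular enough to check mechanically, in line with the deleted document's "trade tokens for reliability" point about scripts. The following is a minimal Python illustration under stated assumptions. The function names `parse_avoid_lists` and `find_avoided_terms` are hypothetical, and nothing here is taken from qualia-framework or from Matt Pocock's repo. It assumes each glossary term is a `### Term` heading followed somewhere by a single-line `**Avoid:** a, b, c.` entry, as in the example above.

```python
import re

# Sample glossary in the CONTEXT.md format quoted in the deleted document.
SAMPLE_CONTEXT = """\
## Language

### Milestone
A bounded slice of the journey. Always one of: Foundation, MVP, Polish, Handoff.
**Avoid:** sprint, iteration, release.

### Phase
A unit of work inside a milestone, 2-5 tasks, ends in a verification gate.
**Avoid:** epic, story, ticket.

### Task
A single commit-sized unit with one verification contract.
**Avoid:** subtask, chore.
"""


def parse_avoid_lists(context_md: str) -> dict[str, str]:
    """Map each avoided word (lowercased) to the preferred glossary term."""
    avoided: dict[str, str] = {}
    current_term = None
    for line in context_md.splitlines():
        heading = re.match(r"^###\s+(.+?)\s*$", line)
        if heading:
            current_term = heading.group(1)
            continue
        avoid = re.match(r"^\*\*Avoid:\*\*\s*(.+?)\.?\s*$", line)
        if avoid and current_term:
            for word in avoid.group(1).split(","):
                word = word.strip().lower()
                if word:
                    avoided[word] = current_term
    return avoided


def find_avoided_terms(text: str, avoided: dict[str, str]) -> list[tuple[int, str, str]]:
    """Return (line number, avoided word, preferred term) for each hit.

    Matching is case-insensitive, on whole words, and tolerates a trailing "s";
    irregular plurals such as "stories" are deliberately not handled here.
    """
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for word, preferred in avoided.items():
            if re.search(rf"\b{re.escape(word)}s?\b", line, flags=re.IGNORECASE):
                hits.append((line_no, word, preferred))
    return hits


if __name__ == "__main__":
    plan = "Sprint 1 covers auth.\nEach phase has 3 tasks.\nSplit the epic into tickets."
    for line_no, word, preferred in find_avoided_terms(plan, parse_avoid_lists(SAMPLE_CONTEXT)):
        print(f"line {line_no}: use '{preferred}' instead of '{word}'")
```

A check like this could run over a phase plan or a skill description before a subagent reads it. Whether it belongs in a hook, a script under `scripts/`, or nowhere at all is a design decision the deleted document does not settle.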