qualia-framework 4.1.1 → 4.4.0

Files changed (43)
  1. package/README.md +15 -11
  2. package/agents/builder.md +28 -0
  3. package/agents/research-synthesizer.md +7 -0
  4. package/bin/agent-runs.js +233 -0
  5. package/bin/cli.js +355 -16
  6. package/bin/install.js +87 -6
  7. package/bin/knowledge-flush.js +164 -0
  8. package/bin/knowledge.js +317 -0
  9. package/bin/plan-contract.js +220 -0
  10. package/bin/state.js +15 -9
  11. package/docs/agent-runs.md +273 -0
  12. package/docs/journey-demo.html +1008 -0
  13. package/docs/plan-contract.md +321 -0
  14. package/docs/reviews/v4.1.0-audit.html +1488 -0
  15. package/docs/reviews/v4.1.0-audit.md +263 -0
  16. package/hooks/auto-update.js +3 -7
  17. package/hooks/git-guardrails.js +167 -0
  18. package/hooks/pre-compact.js +22 -11
  19. package/hooks/pre-deploy-gate.js +16 -2
  20. package/hooks/pre-push.js +22 -2
  21. package/hooks/stop-session-log.js +180 -0
  22. package/package.json +8 -2
  23. package/skills/qualia-build/SKILL.md +5 -5
  24. package/skills/qualia-debug/SKILL.md +1 -1
  25. package/skills/qualia-design/SKILL.md +15 -0
  26. package/skills/qualia-flush/SKILL.md +200 -0
  27. package/skills/qualia-learn/SKILL.md +47 -37
  28. package/skills/qualia-new/SKILL.md +1 -1
  29. package/skills/qualia-plan/SKILL.md +3 -2
  30. package/skills/qualia-postmortem/SKILL.md +238 -0
  31. package/skills/qualia-quick/SKILL.md +1 -1
  32. package/skills/qualia-report/SKILL.md +1 -1
  33. package/skills/qualia-review/SKILL.md +3 -2
  34. package/skills/qualia-ship/SKILL.md +12 -10
  35. package/skills/qualia-verify/SKILL.md +60 -0
  36. package/templates/help.html +13 -7
  37. package/templates/knowledge/agents.md +71 -0
  38. package/templates/knowledge/index.md +47 -0
  39. package/tests/bin.test.sh +322 -12
  40. package/tests/hooks.test.sh +131 -20
  41. package/tests/lib.test.sh +217 -0
  42. package/tests/runner.js +103 -77
  43. package/tests/state.test.sh +4 -3
@@ -0,0 +1,321 @@
1
+ # Plan Contract
2
+
3
+ Machine-readable plan format consumed by builder, verifier, plan-checker, and `state.js`. Replaces ad-hoc markdown re-parsing — markdown plans become presentation, this JSON contract is truth.
4
+
5
+ Status: **draft, v1.** Pressure-test the shape against real phases before locking.
6
+
7
+ ## Why this exists
8
+
9
+ Today, `templates/plan.md` is structured markdown. Planner emits it, builder re-interprets it, verifier re-interprets it, plan-checker re-interprets it. Three independent LLM interpretations of the same prose = drift. The drift is invisible until verification fails for a reason that doesn't match the planner's intent.
10
+
11
+ The contract shifts every machine-driven step (task assignment, dependency check, verification execution) onto deterministic JSON. Prose stays in `phase-N-plan.md` for humans; code reads `phase-N-contract.json`.
12
+
13
+ ## File layout
14
+
15
+ ```
16
+ .planning/
17
+ phase-1-plan.md # human-facing prose (existing)
18
+ phase-1-contract.json # machine truth (NEW)
19
+ phase-1-deviations.json # builder→verifier deltas (existing)
20
+ phase-1-verification.md # verifier output (existing)
21
+ ```
22
+
23
+ `phase-N-contract.json` is committed. It is regenerated only by re-running `/qualia-plan` or `qualia-framework state.js compile-plan`.
24
+
25
+ ## Schema (v1)
26
+
27
+ TypeScript-flavored for readability. The authoritative validator lives at `bin/lib/plan-contract.js` and is hand-rolled in plain Node, so the framework keeps zero dependencies (see "Design decisions").
28
+
29
+ ```ts
30
+ interface PlanContract {
31
+ version: 1; // bump on breaking change
32
+ phase: number; // 1-indexed
33
+ goal: string; // 1-2 sentences, what's TRUE when done
34
+ why: string; // unlocks-what; one sentence
35
+ generated_at: string; // ISO 8601 UTC
36
+ generated_by: "planner" | "compile-plan" | "manual";
37
+ source_plan_hash: string; // sha256 of phase-N-plan.md at compile time; "" if generated_by="manual"
38
+ tasks: Task[];
39
+ success_criteria: string[]; // phase-level user-facing truths
40
+ }
41
+
42
+ interface Task {
43
+ id: string; // "T1", "T2" — stable across reorders
44
+ title: string;
45
+ wave: number; // 1-indexed; tasks in same wave run in parallel
46
+ depends_on: string[]; // task ids this task needs
47
+ persona?: PersonaTag; // optional, for agent specialization
48
+ files_modify: string[]; // repo-relative paths
49
+ files_create: string[]; // repo-relative paths
50
+ files_delete: string[]; // repo-relative paths (for refactors that remove code)
51
+ acceptance_criteria: string[]; // observable behaviors (human-facing)
52
+ action: string; // concrete builder steps (advisory prose, max 500 chars)
53
+ context_files: string[]; // repo-relative paths the builder should read
54
+ verification: VerificationCheck[];
55
+ }
56
+
57
+ type PersonaTag =
58
+ | "security" | "architect" | "ux" | "frontend"
59
+ | "backend" | "performance" | "none";
60
+
61
+ type VerificationCheck =
62
+ | FileExistsCheck
63
+ | GrepMatchCheck
64
+ | CommandExitCheck
65
+ | BehavioralCheck;
66
+
67
+ interface FileExistsCheck {
68
+ type: "file-exists";
69
+ path: string; // repo-relative
70
+ must_contain?: string; // optional substring assertion
71
+ }
72
+
73
+ interface GrepMatchCheck {
74
+ type: "grep-match";
75
+ path: string; // file or glob
76
+ pattern: string; // regex
77
+ expect: "present" | "absent";
78
+ }
79
+
80
+ interface CommandExitCheck {
81
+ type: "command-exit";
82
+ command: string; // executed via execFile, NOT shell
83
+ args: string[]; // positional args (no shell parsing)
84
+ cwd?: string; // repo-relative; default = repo root
85
+ expected_exit: number; // typically 0
86
+ timeout_ms?: number; // default 30000
87
+ expect_stdout_match?: string; // regex; optional
88
+ }
89
+
90
+ interface BehavioralCheck {
91
+ type: "behavioral";
92
+ description: string; // human-readable; verifier interprets
93
+ evidence_required: Evidence[]; // structured citation requirements; vibes-based passes blocked at schema level
94
+ }
95
+
96
+ interface Evidence {
97
+ path: string; // repo-relative file path the verifier must cite
98
+ matcher?: string; // optional regex the cited line must satisfy
99
+ description: string; // what the cited line should demonstrate
100
+ }
101
+ ```
102
+
103
+ ### Why these four check types
104
+
105
+ They map 1:1 onto the existing markdown Verification Contract section, so compilation is mechanical:
106
+
107
+ | Markdown section | Maps to |
108
+ |---|---|
109
+ | `Check type: file-exists` | `FileExistsCheck` |
110
+ | `Check type: grep-match` | `GrepMatchCheck` |
111
+ | `Check type: command-exit` | `CommandExitCheck` |
112
+ | `Check type: behavioral` | `BehavioralCheck` (last resort) |
113
+
114
+ `behavioral` is the only check that retains LLM interpretation — and even there, the schema forces evidence-required so the verifier can't produce vibes-based passes.
115
+
116
+ ## Example: a real phase contract
117
+
118
+ ```json
119
+ {
120
+ "version": 1,
121
+ "phase": 2,
122
+ "goal": "Authenticated users can sign in with email/password and reach the dashboard.",
123
+ "why": "Session persistence is the #1 abandonment trigger in onboarding — verification emails are wasted without it.",
124
+ "generated_at": "2026-04-28T14:32:00Z",
125
+ "generated_by": "planner",
126
+ "source_plan_hash": "sha256:9c1ae6f2b4d8e1f3a5c7b9d0e2f4a6c8e0b1d3f5a7c9e1b3d5f7a9c1e3b5d7f9",
127
+ "tasks": [
128
+ {
129
+ "id": "T1",
130
+ "title": "Add email/password sign-in handler",
131
+ "wave": 1,
132
+ "depends_on": [],
133
+ "persona": "backend",
134
+ "files_modify": ["src/lib/auth.ts"],
135
+ "files_create": ["src/lib/auth-schema.ts"],
136
+ "files_delete": [],
137
+ "acceptance_criteria": [
138
+ "POST /api/auth/signin returns 200 with valid creds",
139
+ "POST /api/auth/signin returns 401 with invalid creds",
140
+ "Session cookie is httpOnly and sameSite=lax"
141
+ ],
142
+ "action": "Use supabase.auth.signInWithPassword. Validate email/password with Zod schema. Set cookie via Next.js Response API.",
143
+ "context_files": [
144
+ "src/lib/supabase/server.ts",
145
+ "src/lib/supabase/client.ts"
146
+ ],
147
+ "verification": [
148
+ {
149
+ "type": "file-exists",
150
+ "path": "src/lib/auth-schema.ts",
151
+ "must_contain": "z.object"
152
+ },
153
+ {
154
+ "type": "command-exit",
155
+ "command": "npx",
156
+ "args": ["tsc", "--noEmit"],
157
+ "expected_exit": 0,
158
+ "timeout_ms": 60000
159
+ },
160
+ {
161
+ "type": "grep-match",
162
+ "path": "src/lib/auth.ts",
163
+ "pattern": "signInWithPassword",
164
+ "expect": "present"
165
+ }
166
+ ]
167
+ },
168
+ {
169
+ "id": "T2",
170
+ "title": "Wire sign-in form to handler",
171
+ "wave": 2,
172
+ "depends_on": ["T1"],
173
+ "persona": "frontend",
174
+ "files_modify": ["src/app/(auth)/signin/page.tsx"],
175
+ "files_create": [],
176
+ "files_delete": [],
177
+ "acceptance_criteria": [
178
+ "Form posts to /api/auth/signin",
179
+ "Error toast shows on 401",
180
+ "Redirect to /dashboard on 200"
181
+ ],
182
+ "action": "Add server action; show error state via useFormState; redirect via redirect() from next/navigation.",
183
+ "context_files": ["src/app/(auth)/signin/page.tsx"],
184
+ "verification": [
185
+ {
186
+ "type": "behavioral",
187
+ "description": "Form submission with valid creds redirects to /dashboard",
188
+ "evidence_required": [
189
+ {
190
+ "path": "src/app/(auth)/signin/page.tsx",
191
+ "matcher": "redirect\\(['\"]/dashboard",
192
+ "description": "redirect() call targeting /dashboard after successful signin"
193
+ },
194
+ {
195
+ "path": "src/app/(auth)/signin/page.tsx",
196
+ "matcher": "useFormState|action=",
197
+ "description": "form is wired to a server action or POST handler"
198
+ }
199
+ ]
200
+ }
201
+ ]
202
+ }
203
+ ],
204
+ "success_criteria": [
205
+ "User can sign in with valid credentials and land on /dashboard",
206
+ "User sees a clear error message on invalid credentials without leaving the page",
207
+ "Session persists across page reloads"
208
+ ]
209
+ }
210
+ ```
211
+
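As an illustration of how the `wave` field in a contract like the one above could drive scheduling, the following sketch groups task ids by wave. `groupByWave` is a hypothetical helper, not part of the framework, and it assumes the contract has already passed validation.

```javascript
// Sketch only: group task ids by wave so each wave can be dispatched together.
// `groupByWave` is a hypothetical name; the framework may schedule differently.
function groupByWave(tasks) {
  const waves = new Map();
  for (const task of tasks) {
    if (!waves.has(task.wave)) waves.set(task.wave, []);
    waves.get(task.wave).push(task.id);
  }
  // Sort numerically by wave number so earlier waves run first.
  return [...waves.entries()]
    .sort(([a], [b]) => a - b)
    .map(([wave, ids]) => ({ wave, ids }));
}

const plan = groupByWave([
  { id: "T2", wave: 2 },
  { id: "T1", wave: 1 },
]);
console.log(JSON.stringify(plan));
```

Tasks inside one entry of the result have no ordering between them; ordering across entries follows the wave number.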
212
+ ## Validation rules (enforced at emission)
213
+
214
+ 1. **`tasks[].id` must be unique** within the phase.
215
+ 2. **Task ids must match** `^T\d+$` — `T1`, `T2`, etc. The compiler prefixes markdown task numbers (`## Task 1` → `T1`).
216
+ 3. **`depends_on` must reference ids that exist** in the same contract.
217
+ 4. **No cycles in `depends_on`.**
218
+ 5. **Wave assignment must respect dependencies**: a task's wave must be strictly greater than the max wave of its dependencies. (For example, if T2 depends on T1, then `T2.wave > T1.wave`.)
219
+ 6. **At least one verification check per task.** Empty `verification: []` is rejected.
220
+ 7. **`files_modify`, `files_create`, `files_delete` are pairwise disjoint** — a file is in at most one of the three.
221
+ 8. **`command-exit` checks must use execFile-safe args** — no shell metacharacters in `command`; `args[]` carries positional values.
222
+ 9. **`success_criteria` minimum 1 entry.**
223
+ 10. **`action` ≤ 500 characters** — enforced. Keeps planner from over-specifying implementation.
224
+ 11. **`evidence_required[].path` must be repo-relative** and `matcher` (when present) must be a valid regex.
225
+
226
+ `bin/state.js validate-plan` runs these. Failures block transition to `built`.
227
+
228
+ Validator implementation: hand-rolled at `bin/lib/plan-contract.js`, ~80 LOC, zero dependencies. Framework's no-deps posture is preserved.
229
+
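A minimal sketch of how a dependency-free validator could check some of the rules above (rules 1 to 5 and 7). This is an illustration under assumptions, not the code in `bin/lib/plan-contract.js`: the function name `validateContract` and the error message wording are hypothetical.

```javascript
// Sketch only: covers rules 1-5 and 7; names and messages are hypothetical.
function validateContract(contract) {
  const errors = [];
  const byId = new Map();

  for (const task of contract.tasks) {
    // Rule 2: ids look like T1, T2, ...
    if (!/^T\d+$/.test(task.id)) errors.push(`${task.id}: id must match ^T\\d+$`);
    // Rule 1: ids are unique within the phase.
    if (byId.has(task.id)) errors.push(`${task.id}: duplicate id`);
    byId.set(task.id, task);
  }

  for (const task of contract.tasks) {
    for (const dep of task.depends_on) {
      const depTask = byId.get(dep);
      // Rule 3: every dependency must exist in this contract.
      if (!depTask) { errors.push(`${task.id}: unknown dependency ${dep}`); continue; }
      // Rule 5: a task runs in a strictly later wave than each dependency.
      if (task.wave <= depTask.wave) errors.push(`${task.id}: wave must be > wave of ${dep}`);
    }
    // Rule 7: a path may appear only once across the three file lists.
    const seen = new Set();
    for (const p of [...task.files_modify, ...task.files_create, ...task.files_delete]) {
      if (seen.has(p)) errors.push(`${task.id}: ${p} listed more than once`);
      seen.add(p);
    }
  }

  // Rule 4: depth-first search for a cycle in depends_on.
  const state = new Map(); // id -> "visiting" | "done"
  const visit = (id) => {
    if (state.get(id) === "done") return false;
    if (state.get(id) === "visiting") return true; // back edge: cycle found
    state.set(id, "visiting");
    const deps = (byId.get(id)?.depends_on ?? []).filter((d) => byId.has(d));
    const cyclic = deps.some((d) => visit(d));
    state.set(id, "done");
    return cyclic;
  };
  for (const id of byId.keys()) {
    if (visit(id)) { errors.push(`cycle in depends_on involving ${id}`); break; }
  }
  return errors;
}
```

Returning a list of messages rather than throwing on the first problem lets a caller report every gap in one pass, which fits the "list of gaps" behavior described for `compile-plan`.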
230
+ ## Drift detection (contract vs markdown)
231
+
232
+ Manual edits to `phase-N-plan.md` happen in practice. Without detection, the contract silently goes stale: builder reads JSON truth that no longer matches what humans see in markdown.
233
+
234
+ `source_plan_hash` is `sha256(plan_md_contents)` at compile time, prefixed `sha256:`. Stored in the contract.
235
+
236
+ `bin/state.js validate-plan --check-drift` re-hashes the current plan markdown and compares. Drift behavior:
237
+
238
+ | Scenario | Action |
239
+ |---|---|
240
+ | Hashes match | OK, no output |
241
+ | Hashes differ | Exit 2, message: `plan.md drifted from contract; run compile-plan --refresh` |
242
+ | `source_plan_hash` missing or empty (legacy contracts, or `generated_by: "manual"`) | Warn but pass; drift checking is disabled for that contract |
243
+
244
+ `compile-plan --refresh` re-reads markdown, regenerates contract, updates hash. Builder/verifier refuse to run if `--check-drift` fails.
245
+
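The drift check described above can be sketched with Node's built-in `crypto` module. The helper names `hashPlan` and `checkDrift` are hypothetical, and the sketch assumes the hash is taken over the plan file's UTF-8 contents with no normalization, which the document does not specify.

```javascript
const { createHash } = require("crypto");

// Sketch only: hash the plan markdown in the "sha256:<hex>" form the contract stores.
function hashPlan(planMarkdown) {
  const digest = createHash("sha256").update(planMarkdown, "utf8").digest("hex");
  return `sha256:${digest}`;
}

// Returns "ok", "drift", or "unchecked" (hash missing or empty, e.g. manual contracts).
function checkDrift(contract, planMarkdown) {
  if (!contract.source_plan_hash) return "unchecked";
  return contract.source_plan_hash === hashPlan(planMarkdown) ? "ok" : "drift";
}

const planText = "# Phase 2 plan\n";
const contract = { source_plan_hash: hashPlan(planText) };
console.log(checkDrift(contract, planText));             // unchanged plan
console.log(checkDrift(contract, planText + "edited\n")); // edited plan
```

A caller would map `"drift"` to exit code 2 and `"unchecked"` to a warning, matching the table above.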
246
+ ## Verification execution errors
247
+
248
+ A check that *cannot run* (binary missing, timeout, cwd doesn't exist) is distinct from a check that *ran and failed*. The verifier records:
249
+
250
+ | Outcome | `verification_result` | `failure_reason` |
251
+ |---|---|---|
252
+ | Check ran, passed | `pass` | — |
253
+ | Check ran, criteria unmet | `fail` | `verification-criteria-unmet` |
254
+ | Behavioral check, evidence missing | `fail` | `verification-evidence-missing` |
255
+ | Check itself errored (cmd not found, timeout, etc.) | `partial` | `verification-execution-error` |
256
+
257
+ Execution errors are NOT verification failures. They block phase advance the same way, but a postmortem treats them differently — fix the infrastructure, then re-run.
258
+
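The distinction between "ran and failed" and "could not run" can be sketched for a `command-exit` check as follows. This is an assumption-laden illustration: it uses `spawnSync` with `shell: false` as a synchronous stand-in for the `execFile` call the schema mentions, and `runCommandExit` is a hypothetical name.

```javascript
const { spawnSync } = require("child_process");
const { join } = require("path");

// Sketch only: run one command-exit check without a shell and classify the outcome.
function runCommandExit(check, repoRoot = process.cwd()) {
  const result = spawnSync(check.command, check.args, {
    cwd: check.cwd ? join(repoRoot, check.cwd) : repoRoot,
    timeout: check.timeout_ms ?? 30000, // schema default
    encoding: "utf8",
    shell: false,
  });

  // spawnSync sets `error` when the process could not be started or timed out.
  if (result.error) {
    return { verification_result: "partial", failure_reason: "verification-execution-error" };
  }

  const exitOk = result.status === check.expected_exit;
  const stdoutOk = check.expect_stdout_match
    ? new RegExp(check.expect_stdout_match).test(result.stdout)
    : true;
  return exitOk && stdoutOk
    ? { verification_result: "pass", failure_reason: null }
    : { verification_result: "fail", failure_reason: "verification-criteria-unmet" };
}

// Usage: the current Node binary stands in for a real check command.
const outcome = runCommandExit({
  command: process.execPath,
  args: ["-e", "process.exit(0)"],
  expected_exit: 0,
});
console.log(outcome.verification_result);
```

Because arguments are passed as an array and no shell is involved, metacharacters in `args[]` are never interpreted, which is the property validation rule 8 relies on.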
259
+ ## How builder reads it
260
+
261
+ ```js
262
+ // pseudocode — the actual implementation lives in skills/qualia-build
263
+ const contract = JSON.parse(fs.readFileSync(`.planning/phase-${N}-contract.json`, "utf8"));
264
+ const myTask = contract.tasks.find(t => t.id === assignedTaskId);
265
+
266
+ // builder gets:
267
+ // - exact files to touch
268
+ // - acceptance_criteria as the "definition of done"
269
+ // - context_files to read first
270
+ // - verification[] is the self-check before declaring DONE
271
+ ```
272
+
273
+ The builder still receives the `action` prose as advisory guidance. The contract's file lists, acceptance criteria, and verification checks are the boundary the builder works within.
274
+
275
+ ## How verifier reads it
276
+
277
+ For each task in the contract:
278
+ 1. Walk `verification[]`.
279
+ 2. For deterministic checks (`file-exists`, `grep-match`, `command-exit`): execute and record pass/fail with stdout/stderr captured. Distinguish "ran and failed" (`verification-criteria-unmet`) from "could not run" (`verification-execution-error`).
280
+ 3. For `behavioral` checks: for each `evidence_required[i]`, the verifier MUST produce a `{path, line, snippet}` citation. If `matcher` is present, the cited line must satisfy the regex. Missing evidence or matcher mismatch → automatic `verification-evidence-missing`.
281
+ 4. Aggregate per-task → per-phase pass/fail.
282
+ 5. Write `phase-N-verification.json` (machine output) alongside `phase-N-verification.md` (human output).
283
+
284
+ This eliminates the "verifier wrote a glowing pass when half the criteria weren't actually met" failure mode — `evidence_required[]` is structured, so vibes-based passes are blocked at the schema level.
285
+
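Step 3 above can be sketched as a pure function over the verifier's citations. The shape of `citations` (one `{path, line, snippet}` per `evidence_required` entry, in the same order) and the name `checkEvidence` are assumptions for illustration. The sketch does not confirm that the snippet really appears in the cited file; a real verifier would need that step too.

```javascript
// Sketch only: compare citations against one BehavioralCheck's evidence_required[].
// Assumes citations[i] answers evidence_required[i].
function checkEvidence(behavioralCheck, citations) {
  for (const [i, required] of behavioralCheck.evidence_required.entries()) {
    const cited = citations[i];
    // A citation must exist, point at the required path, and carry a snippet.
    const missing = !cited || cited.path !== required.path || !cited.snippet;
    // When a matcher is given, the cited snippet must satisfy it.
    const mismatch =
      !missing && Boolean(required.matcher) && !new RegExp(required.matcher).test(cited.snippet);
    if (missing || mismatch) {
      return { verification_result: "fail", failure_reason: "verification-evidence-missing" };
    }
  }
  return { verification_result: "pass", failure_reason: null };
}

const check = {
  type: "behavioral",
  description: "Form submission with valid creds redirects to /dashboard",
  evidence_required: [
    { path: "src/app/(auth)/signin/page.tsx", matcher: "redirect\\(['\"]/dashboard", description: "redirect call" },
  ],
};
const result = checkEvidence(check, [
  { path: "src/app/(auth)/signin/page.tsx", line: 1, snippet: 'redirect("/dashboard");' },
]);
console.log(result.verification_result);
```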
286
+ ## Compile mode (migrating in-flight projects)
287
+
288
+ `bin/state.js compile-plan --phase N` reads `phase-N-plan.md` and emits a best-effort `phase-N-contract.json`:
289
+
290
+ - Frontmatter → `phase`, `goal`
291
+ - `## Task N — title` blocks → `tasks[]`
292
+ - `**Files:**` line → `files_modify` (cannot distinguish create vs modify from prose; defaults to modify, warns)
293
+ - `**Acceptance Criteria:**` bullets → `acceptance_criteria`
294
+ - `### Contract for Task N` blocks → `verification[]`
295
+ - Missing fields → `compile-plan` exits non-zero with a list of gaps
296
+
297
+ Compile is a one-time bridge. New plans emit JSON directly from the planner agent.
298
+
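The heading-to-id mapping used by compile mode might look like the sketch below. It assumes headings follow the `## Task N` form followed by a dash and a title, as listed above; the exact markdown grammar accepted by `compile-plan` is not specified here, and `parseTaskHeadings` is a hypothetical name.

```javascript
// Sketch only: extract task ids and titles from "## Task N <dash> title" headings.
// \u2014 is the long dash used in plan headings; a plain hyphen is accepted too.
function parseTaskHeadings(planMarkdown) {
  const heading = /^## Task (\d+)(?:\s*[\u2014-]\s*(.*))?$/;
  const tasks = [];
  for (const line of planMarkdown.split(/\r?\n/)) {
    const m = heading.exec(line.trim());
    // "## Task 1" becomes id "T1"; a missing title becomes an empty string.
    if (m) tasks.push({ id: `T${m[1]}`, title: (m[2] ?? "").trim() });
  }
  return tasks;
}

const sample = [
  "# Phase 2",
  "## Task 1 \u2014 Add email/password sign-in handler",
  "### Contract for Task 1",
  "## Task 2 - Wire sign-in form to handler",
].join("\n");
console.log(parseTaskHeadings(sample).map((t) => t.id).join(","));
```

Only level-two headings match, so `### Contract for Task N` blocks are left for a separate pass that builds `verification[]`.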
299
+ ## Design decisions (locked v1)
300
+
301
+ These were called out as open questions during draft; resolved here so implementation can proceed.
302
+
303
+ 1. **Persona enum:** dropped `data` — covered by `backend`. Six personas + `none`.
304
+ 2. **`acceptance_criteria` vs `verification[]`:** kept separate. AC is the human-facing definition of done (lands in commit messages, milestone summaries, ERP reports). `verification[]` is the mechanical execution path. The verifier never interprets AC — it executes `verification[]`. This separation is the whole point of the contract.
305
+ 3. **`action` cap:** 500 chars. Advisory only. Validator enforces.
306
+ 4. **Versioning:** in-place migration via `compile-plan --upgrade`. `version` field tells the loader which schema to apply. No filename suffixes — canonical filename stays `phase-N-contract.json`.
307
+ 5. **Wave placement:** lives on the task. The validator enforces `task.wave > max(deps wave)` so the redundancy with `depends_on` is contained. Wave is a scheduling/display hint; `depends_on` is the constraint.
308
+ 6. **`behavioral` checks:** permanent. UX feel, error message clarity, animation timing — none of these are deterministic. The escape hatch is healthy. The `evidence_required[]` field forces verifier to cite proof; vibes-based passes are blocked at the schema level.
309
+ 7. **Validator:** hand-rolled in plain Node. Framework keeps zero npm dependencies. Zod is rejected for this layer.
310
+
311
+ ## Migration plan
312
+
313
+ 1. Add schema + validator + `compile-plan` command. No callers yet.
314
+ 2. Backfill contracts for active projects via `compile-plan` — manual review of warnings.
315
+ 3. Update planner agent prompt to emit JSON alongside markdown.
316
+ 4. Update builder skill to read JSON for files/AC/verification; markdown still readable.
317
+ 5. Update verifier agent to execute `verification[]` deterministically; keep prose verification report for humans.
318
+ 6. Update plan-checker to validate JSON.
319
+ 7. After two milestones run cleanly on JSON, mark prose plan as advisory-only in docs.
320
+
321
+ No hard cutover. Both formats coexist during migration.