npm - qualia-framework - Versions diffs - 4.1.1 → 4.4.0 - Mend

qualia-framework 4.1.1 → 4.4.0

Files changed (43)

package/README.md +15 -11
package/agents/builder.md +28 -0
package/agents/research-synthesizer.md +7 -0
package/bin/agent-runs.js +233 -0
package/bin/cli.js +355 -16
package/bin/install.js +87 -6
package/bin/knowledge-flush.js +164 -0
package/bin/knowledge.js +317 -0
package/bin/plan-contract.js +220 -0
package/bin/state.js +15 -9
package/docs/agent-runs.md +273 -0
package/docs/journey-demo.html +1008 -0
package/docs/plan-contract.md +321 -0
package/docs/reviews/v4.1.0-audit.html +1488 -0
package/docs/reviews/v4.1.0-audit.md +263 -0
package/hooks/auto-update.js +3 -7
package/hooks/git-guardrails.js +167 -0
package/hooks/pre-compact.js +22 -11
package/hooks/pre-deploy-gate.js +16 -2
package/hooks/pre-push.js +22 -2
package/hooks/stop-session-log.js +180 -0
package/package.json +8 -2
package/skills/qualia-build/SKILL.md +5 -5
package/skills/qualia-debug/SKILL.md +1 -1
package/skills/qualia-design/SKILL.md +15 -0
package/skills/qualia-flush/SKILL.md +200 -0
package/skills/qualia-learn/SKILL.md +47 -37
package/skills/qualia-new/SKILL.md +1 -1
package/skills/qualia-plan/SKILL.md +3 -2
package/skills/qualia-postmortem/SKILL.md +238 -0
package/skills/qualia-quick/SKILL.md +1 -1
package/skills/qualia-report/SKILL.md +1 -1
package/skills/qualia-review/SKILL.md +3 -2
package/skills/qualia-ship/SKILL.md +12 -10
package/skills/qualia-verify/SKILL.md +60 -0
package/templates/help.html +13 -7
package/templates/knowledge/agents.md +71 -0
package/templates/knowledge/index.md +47 -0
package/tests/bin.test.sh +322 -12
package/tests/hooks.test.sh +131 -20
package/tests/lib.test.sh +217 -0
package/tests/runner.js +103 -77
package/tests/state.test.sh +4 -3

package/docs/plan-contract.md ADDED

@@ -0,0 +1,321 @@
+# Plan Contract
+Machine-readable plan format consumed by builder, verifier, plan-checker, and `state.js`. Replaces ad-hoc markdown re-parsing — markdown plans become presentation, this JSON contract is truth.
+Status: **draft, v1.** Pressure-test the shape against real phases before locking.
+## Why this exists
+Today, `templates/plan.md` is structured markdown. The planner emits it, then the builder, the verifier, and the plan-checker each re-interpret it. Three independent LLM interpretations of the same prose produce drift, and that drift is invisible until verification fails for a reason that doesn't match the planner's intent.
+The contract shifts every machine-driven step (task assignment, dependency check, verification execution) onto deterministic JSON. Prose stays in `phase-N-plan.md` for humans; code reads `phase-N-contract.json`.
+## File layout
+```
+.planning/
+  phase-1-plan.md           # human-facing prose (existing)
+  phase-1-contract.json     # machine truth (NEW)
+  phase-1-deviations.json   # builder→verifier deltas (existing)
+  phase-1-verification.md   # verifier output (existing)
+```
+`phase-N-contract.json` is committed. It is regenerated only by re-running `/qualia-plan` or `qualia-framework state.js compile-plan`.
+## Schema (v1)
+TypeScript-flavored for readability. The authoritative validator lives at `bin/lib/plan-contract.js`; it is hand-rolled, because the framework has zero dependencies and Zod was rejected for this layer (see Design decisions).
+```ts
+interface PlanContract {
+  version: 1;                    // bump on breaking change
+  phase: number;                 // 1-indexed
+  goal: string;                  // 1-2 sentences, what's TRUE when done
+  why: string;                   // unlocks-what; one sentence
+  generated_at: string;          // ISO 8601 UTC
+  generated_by: "planner" | "compile-plan" | "manual";
+  source_plan_hash: string;      // "sha256:" + hex digest of phase-N-plan.md at compile time; "" if generated_by="manual"
+  tasks: Task[];
+  success_criteria: string[];    // phase-level user-facing truths
+}
+interface Task {
+  id: string;                    // "T1", "T2" — stable across reorders
+  title: string;
+  wave: number;                  // 1-indexed; tasks in same wave run in parallel
+  depends_on: string[];          // task ids this task needs
+  persona?: PersonaTag;          // optional, for agent specialization
+  files_modify: string[];        // repo-relative paths
+  files_create: string[];        // repo-relative paths
+  files_delete: string[];        // repo-relative paths (for refactors that remove code)
+  acceptance_criteria: string[]; // observable behaviors (human-facing)
+  action: string;                // concrete builder steps (advisory prose, max 500 chars)
+  context_files: string[];       // repo-relative paths the builder should read
+  verification: VerificationCheck[];
+}
+type PersonaTag =
+  | "security" | "architect" | "ux" | "frontend"
+  | "backend" | "performance" | "none";
+type VerificationCheck =
+  | FileExistsCheck
+  | GrepMatchCheck
+  | CommandExitCheck
+  | BehavioralCheck;
+interface FileExistsCheck {
+  type: "file-exists";
+  path: string;                  // repo-relative
+  must_contain?: string;         // optional substring assertion
+}
+interface GrepMatchCheck {
+  type: "grep-match";
+  path: string;                  // file or glob
+  pattern: string;               // regex
+  expect: "present" | "absent";
+}
+interface CommandExitCheck {
+  type: "command-exit";
+  command: string;               // executed via execFile, NOT shell
+  args: string[];                // positional args (no shell parsing)
+  cwd?: string;                  // repo-relative; default = repo root
+  expected_exit: number;         // typically 0
+  timeout_ms?: number;           // default 30000
+  expect_stdout_match?: string;  // regex; optional
+}
+interface BehavioralCheck {
+  type: "behavioral";
+  description: string;           // human-readable; verifier interprets
+  evidence_required: Evidence[]; // structured citation requirements; vibes-based passes blocked at schema level
+}
+interface Evidence {
+  path: string;                  // repo-relative file path the verifier must cite
+  matcher?: string;              // optional regex the cited line must satisfy
+  description: string;           // what the cited line should demonstrate
+}
+```
+### Why these four check types
+They map 1:1 onto the existing markdown Verification Contract section, so compilation is mechanical:
+| Markdown section | Maps to |
+|---|---|
+| `Check type: file-exists` | `FileExistsCheck` |
+| `Check type: grep-match` | `GrepMatchCheck` |
+| `Check type: command-exit` | `CommandExitCheck` |
+| `Check type: behavioral` | `BehavioralCheck` (last resort) |
+`behavioral` is the only check that retains LLM interpretation — and even there, the schema forces evidence-required so the verifier can't produce vibes-based passes.
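To make the deterministic side concrete, here is a minimal sketch of how the two file-based check types could be evaluated. The helper name `runFileCheck` is hypothetical, glob expansion for `grep-match` paths is left out, and treating a missing file as "pattern absent" is an assumption rather than documented behavior.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical helper: evaluates file-exists and grep-match checks against a repo root.
// Assumption: check.path is a plain repo-relative file path (no glob expansion here).
function runFileCheck(check, repoRoot) {
  const target = path.join(repoRoot, check.path);
  const exists = fs.existsSync(target);

  if (check.type === "file-exists") {
    if (!exists) return false;
    if (check.must_contain === undefined) return true;
    // Optional substring assertion from the schema.
    return fs.readFileSync(target, "utf8").includes(check.must_contain);
  }

  if (check.type === "grep-match") {
    // Assumption: a missing file counts as "pattern not found".
    const found = exists && new RegExp(check.pattern).test(fs.readFileSync(target, "utf8"));
    return check.expect === "present" ? found : !found;
  }

  throw new Error(`not a file-based check: ${check.type}`);
}
```

Both branches are pure functions of the file contents, which is what lets the verifier run them without any LLM interpretation.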
+## Example: a real phase contract
+```json
+{
+  "version": 1,
+  "phase": 2,
+  "goal": "Authenticated users can sign in with email/password and reach the dashboard.",
+  "why": "Session persistence is the #1 abandonment trigger in onboarding — verification emails are wasted without it.",
+  "generated_at": "2026-04-28T14:32:00Z",
+  "generated_by": "planner",
+  "source_plan_hash": "sha256:9c1ae6f2b4d8e1f3a5c7b9d0e2f4a6c8e0b1d3f5a7c9e1b3d5f7a9c1e3b5d7f9",
+  "tasks": [
+    {
+      "id": "T1",
+      "title": "Add email/password sign-in handler",
+      "wave": 1,
+      "depends_on": [],
+      "persona": "backend",
+      "files_modify": ["src/lib/auth.ts"],
+      "files_create": ["src/lib/auth-schema.ts"],
+      "files_delete": [],
+      "acceptance_criteria": [
+        "POST /api/auth/signin returns 200 with valid creds",
+        "POST /api/auth/signin returns 401 with invalid creds",
+        "Session cookie is httpOnly and sameSite=lax"
+      ],
+      "action": "Use supabase.auth.signInWithPassword. Validate email/password with Zod schema. Set cookie via Next.js Response API.",
+      "context_files": [
+        "src/lib/supabase/server.ts",
+        "src/lib/supabase/client.ts"
+      ],
+      "verification": [
+        {
+          "type": "file-exists",
+          "path": "src/lib/auth-schema.ts",
+          "must_contain": "z.object"
+        },
+        {
+          "type": "command-exit",
+          "command": "npx",
+          "args": ["tsc", "--noEmit"],
+          "expected_exit": 0,
+          "timeout_ms": 60000
+        },
+        {
+          "type": "grep-match",
+          "path": "src/lib/auth.ts",
+          "pattern": "signInWithPassword",
+          "expect": "present"
+        }
+      ]
+    },
+    {
+      "id": "T2",
+      "title": "Wire sign-in form to handler",
+      "wave": 2,
+      "depends_on": ["T1"],
+      "persona": "frontend",
+      "files_modify": ["src/app/(auth)/signin/page.tsx"],
+      "files_create": [],
+      "files_delete": [],
+      "acceptance_criteria": [
+        "Form posts to /api/auth/signin",
+        "Error toast shows on 401",
+        "Redirect to /dashboard on 200"
+      ],
+      "action": "Add server action; show error state via useFormState; redirect via redirect() from next/navigation.",
+      "context_files": ["src/app/(auth)/signin/page.tsx"],
+      "verification": [
+        {
+          "type": "behavioral",
+          "description": "Form submission with valid creds redirects to /dashboard",
+          "evidence_required": [
+            {
+              "path": "src/app/(auth)/signin/page.tsx",
+              "matcher": "redirect\\(['\"]/dashboard",
+              "description": "redirect() call targeting /dashboard after successful signin"
+            },
+            {
+              "path": "src/app/(auth)/signin/page.tsx",
+              "matcher": "useFormState|action=",
+              "description": "form is wired to a server action or POST handler"
+            }
+          ]
+        }
+      ]
+    }
+  ],
+  "success_criteria": [
+    "User can sign in with valid credentials and land on /dashboard",
+    "User sees a clear error message on invalid credentials without leaving the page",
+    "Session persists across page reloads"
+  ]
+}
+```
+## Validation rules (enforced at emission)
+1. **`tasks[].id` must be unique** within the phase.
+2. **Task ids must match** `^T\d+$` — `T1`, `T2`, etc. The compiler prefixes markdown task numbers (`## Task 1` → `T1`).
+3. **`depends_on` must reference ids that exist** in the same contract.
+4. **No cycles in `depends_on`.**
+5. **Wave assignment must respect dependencies:** a task's wave must be strictly greater than the max wave of its dependencies. (Trivially: if T2 depends on T1, `T2.wave > T1.wave`.)
+6. **At least one verification check per task.** Empty `verification: []` is rejected.
+7. **`files_modify`, `files_create`, `files_delete` are pairwise disjoint** — a file is in at most one of the three.
+8. **`command-exit` checks must use execFile-safe args** — no shell metacharacters in `command`; `args[]` carries positional values.
+9. **`success_criteria` minimum 1 entry.**
+10. **`action` ≤ 500 characters** — enforced. Keeps planner from over-specifying implementation.
+11. **`evidence_required[].path` must be repo-relative** and `matcher` (when present) must be a valid regex.
+`bin/state.js validate-plan` runs these. Failures block transition to `built`.
+Validator implementation: hand-rolled at `bin/lib/plan-contract.js`, ~80 LOC, zero dependencies. Framework's no-deps posture is preserved.
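A rough sketch of what a dependency-free validator for some of these rules might look like follows. It is not the code in `bin/lib/plan-contract.js`; the function name `validateContract` and the error strings are invented for illustration. It covers rules 1, 2, 3, 5, 6, 7, 9 and 10. Rule 4 needs no separate pass here, because strictly increasing waves along every dependency edge already rule out cycles. Rules 8 and 11 are omitted.

```javascript
// Hypothetical validator sketch; returns a list of human-readable error strings.
function validateContract(contract) {
  const errors = [];
  const tasks = Array.isArray(contract.tasks) ? contract.tasks : [];
  const byId = new Map();

  for (const t of tasks) {
    if (!/^T\d+$/.test(t.id)) errors.push(`${t.id}: id must match ^T\\d+$`); // rule 2
    if (byId.has(t.id)) errors.push(`${t.id}: duplicate id`);                 // rule 1
    byId.set(t.id, t);
  }

  for (const t of tasks) {
    for (const dep of t.depends_on) {
      const target = byId.get(dep);
      if (!target) {
        errors.push(`${t.id}: unknown dependency ${dep}`);                    // rule 3
      } else if (t.wave <= target.wave) {
        errors.push(`${t.id}: wave must be greater than wave of ${dep}`);     // rule 5
      }
    }
    if (t.verification.length === 0) {
      errors.push(`${t.id}: at least one verification check required`);       // rule 6
    }
    // Rule 7. Note: this also flags a path repeated inside a single list.
    const touched = [...t.files_modify, ...t.files_create, ...t.files_delete];
    if (new Set(touched).size !== touched.length) {
      errors.push(`${t.id}: a path is listed more than once across files_*`);
    }
    if (t.action.length > 500) {
      errors.push(`${t.id}: action exceeds 500 characters`);                  // rule 10
    }
  }

  if (!Array.isArray(contract.success_criteria) || contract.success_criteria.length < 1) {
    errors.push("success_criteria needs at least one entry");                 // rule 9
  }
  return errors;
}
```

Collecting every error instead of throwing on the first one lets a caller report all gaps in a single run.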
+## Drift detection (contract vs markdown)
+Manual edits to `phase-N-plan.md` happen in practice. Without detection, the contract silently goes stale: builder reads JSON truth that no longer matches what humans see in markdown.
+`source_plan_hash` is `sha256(plan_md_contents)` at compile time, prefixed `sha256:`. Stored in the contract.
+`bin/state.js validate-plan --check-drift` re-hashes the current plan markdown and compares. Drift behavior:
+| Scenario | Action |
+|---|---|
+| Hashes match | OK, no output |
+| Hashes differ | Exit 2, message: `plan.md drifted from contract; run compile-plan --refresh` |
+| Contract has a missing or empty `source_plan_hash` (legacy or `manual`) | Warn but pass; drift checking is disabled for that contract |
+`compile-plan --refresh` re-reads markdown, regenerates contract, updates hash. Builder/verifier refuse to run if `--check-drift` fails.
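The drift table can be sketched with Node's built-in `crypto` module. The helper names `hashPlan` and `checkDrift` and the returned object shape are assumptions for illustration. Only the `sha256:` prefix, exit code 2, and the drift message come from the text above.

```javascript
const crypto = require("node:crypto");

// Hash format from the text: "sha256:" followed by the hex digest of the plan markdown.
function hashPlan(markdown) {
  return "sha256:" + crypto.createHash("sha256").update(markdown, "utf8").digest("hex");
}

// Hypothetical helper mirroring the drift table; returns { exitCode, message }.
function checkDrift(contract, currentMarkdown) {
  if (!contract.source_plan_hash) {
    // Legacy or manual contract: warn but pass.
    return { exitCode: 0, message: "warning: no source_plan_hash; drift check skipped" };
  }
  if (contract.source_plan_hash === hashPlan(currentMarkdown)) {
    return { exitCode: 0, message: "" };
  }
  return { exitCode: 2, message: "plan.md drifted from contract; run compile-plan --refresh" };
}
```

Because the hash covers the raw file contents, even a whitespace-only edit to the markdown registers as drift under this sketch.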
+## Verification execution errors
+A check that *cannot run* (binary missing, timeout, cwd doesn't exist) is distinct from a check that *ran and failed*. The verifier records:
+| Outcome | `verification_result` | `failure_reason` |
+|---|---|---|
+| Check ran, passed | `pass` | — |
+| Check ran, criteria unmet | `fail` | `verification-criteria-unmet` |
+| Behavioral check, evidence missing | `fail` | `verification-evidence-missing` |
+| Check itself errored (cmd not found, timeout, etc.) | `partial` | `verification-execution-error` |
+Execution errors are NOT verification failures. They block phase advance the same way, but a postmortem treats them differently — fix the infrastructure, then re-run.
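As an illustration of the ran-versus-could-not-run split, the sketch below classifies a `command-exit` check. It uses `child_process.spawnSync` in place of the `execFile` call named in the schema, since both bypass the shell by default and `spawnSync` reports spawn errors and timeouts without throwing. The function name `runCommandCheck` and the `null` value for an absent `failure_reason` are assumptions.

```javascript
const { spawnSync } = require("node:child_process");
const path = require("node:path");

// Hypothetical runner for a command-exit check; no shell is involved.
function runCommandCheck(check, repoRoot = process.cwd()) {
  const res = spawnSync(check.command, check.args, {
    cwd: check.cwd ? path.join(repoRoot, check.cwd) : repoRoot,
    timeout: check.timeout_ms ?? 30000, // schema default
    encoding: "utf8",
  });

  // Spawn failure (missing binary, bad cwd) or a kill on timeout: the check could not run.
  if (res.error || res.status === null) {
    return { verification_result: "partial", failure_reason: "verification-execution-error" };
  }

  const stdoutOk = check.expect_stdout_match === undefined
    || new RegExp(check.expect_stdout_match).test(res.stdout);

  if (res.status === check.expected_exit && stdoutOk) {
    return { verification_result: "pass", failure_reason: null };
  }
  // The command ran to completion, but the exit code or stdout did not match.
  return { verification_result: "fail", failure_reason: "verification-criteria-unmet" };
}
```

For example, `runCommandCheck({ command: "npx", args: ["tsc", "--noEmit"], expected_exit: 0, timeout_ms: 60000 })` would yield `partial` on a machine without `npx` and `fail` only when the type check itself reports errors.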
+## How builder reads it
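+```js
+// sketch with a hypothetical helper name; the actual implementation lives in skills/qualia-build
+const fs = require("node:fs");
+
+function loadAssignedTask(phase, assignedTaskId) {
+  const raw = fs.readFileSync(`.planning/phase-${phase}-contract.json`, "utf8");
+  const contract = JSON.parse(raw);
+  const myTask = contract.tasks.find(t => t.id === assignedTaskId);
+  if (!myTask) throw new Error(`task ${assignedTaskId} not found in phase ${phase} contract`);
+  return myTask;
+}
+// builder gets:
+//   - exact files to touch
+//   - acceptance_criteria as the "definition of done"
+//   - context_files to read first
+//   - verification[] is the self-check before declaring DONE
+```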
+The builder still receives the Action prose as advisory guidance. The contract is the boundary.
+## How verifier reads it
+For each task in the contract:
+1. Walk `verification[]`.
+2. For deterministic checks (`file-exists`, `grep-match`, `command-exit`): execute and record pass/fail with stdout/stderr captured. Distinguish "ran and failed" (`verification-criteria-unmet`) from "could not run" (`verification-execution-error`).
+3. For `behavioral` checks: for each `evidence_required[i]`, the verifier MUST produce a `{path, line, snippet}` citation. If `matcher` is present, the cited line must satisfy the regex. Missing evidence or matcher mismatch → automatic `verification-evidence-missing`.
+4. Aggregate per-task → per-phase pass/fail.
+5. Write `phase-N-verification.json` (machine output) alongside `phase-N-verification.md` (human output).
+This eliminates the "verifier wrote a glowing pass when half the criteria weren't actually met" failure mode — `evidence_required[]` is structured, so vibes-based passes are blocked at the schema level.
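Step 3 can be sketched as a mechanical check on each citation. The sketch assumes citations arrive in the same order as `evidence_required[]`, that `line` is 1-indexed, and that the caller supplies a `readFile(path)` function. The names `evidenceSatisfied` and `checkBehavioral` are hypothetical.

```javascript
// Hypothetical check: `citation` is the { path, line, snippet } the verifier produced,
// `fileText` is the current content of the cited file. Lines are assumed 1-indexed.
function evidenceSatisfied(evidence, citation, fileText) {
  if (!citation || citation.path !== evidence.path) return false;
  const actual = fileText.split("\n")[citation.line - 1];
  // The cited line must exist and actually contain the quoted snippet.
  if (actual === undefined || !actual.includes(citation.snippet)) return false;
  // When a matcher is given, the cited line must satisfy it.
  return evidence.matcher === undefined || new RegExp(evidence.matcher).test(actual);
}

// Every required piece of evidence needs a valid citation, otherwise the check fails.
function checkBehavioral(check, citations, readFile) {
  const ok = check.evidence_required.every((ev, i) =>
    citations[i] !== undefined && evidenceSatisfied(ev, citations[i], readFile(ev.path)));
  return ok
    ? { verification_result: "pass", failure_reason: null }
    : { verification_result: "fail", failure_reason: "verification-evidence-missing" };
}
```

Re-reading the cited line from disk, rather than trusting the snippet, means a fabricated citation fails the same way as a missing one.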
+## Compile mode (migrating in-flight projects)
+`bin/state.js compile-plan --phase N` reads `phase-N-plan.md` and emits a best-effort `phase-N-contract.json`:
+- Frontmatter → `phase`, `goal`
+- `## Task N — title` blocks → `tasks[]`
+- `**Files:**` line → `files_modify` (cannot distinguish create vs modify from prose; defaults to modify, warns)
+- `**Acceptance Criteria:**` bullets → `acceptance_criteria`
+- `### Contract for Task N` blocks → `verification[]`
+- Missing fields → `compile-plan` exits non-zero with a list of gaps
+Compile is a one-time bridge. New plans emit JSON directly from the planner agent.
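The task-block part of the mapping above might be sketched as follows. This is an illustrative parser, not the `compile-plan` implementation: it assumes the `## Task N` heading form listed above with a single dash character (written as `\u2014` in the regex) before the title, a comma-separated `**Files:**` line, and `- ` bullets under `**Acceptance Criteria:**`. Frontmatter and `### Contract for Task N` blocks are not handled.

```javascript
// Hypothetical parser for task blocks only; returns partial tasks plus warnings.
function compileTasks(markdown) {
  const tasks = [];
  const warnings = [];
  let current = null;
  let inCriteria = false;

  for (const line of markdown.split("\n")) {
    const heading = line.match(/^## Task (\d+) \u2014 (.+)$/);
    if (heading) {
      // Markdown task number N becomes the stable id "TN".
      current = { id: `T${heading[1]}`, title: heading[2].trim(), files_modify: [], acceptance_criteria: [] };
      tasks.push(current);
      inCriteria = false;
    } else if (current && line.startsWith("**Files:**")) {
      current.files_modify = line.slice("**Files:**".length).split(",")
        .map(f => f.trim().replace(/`/g, "")).filter(Boolean);
      // Prose cannot distinguish create from modify, so default and warn.
      warnings.push(`${current.id}: files defaulted to files_modify`);
      inCriteria = false;
    } else if (current && line.startsWith("**Acceptance Criteria:**")) {
      inCriteria = true;
    } else if (current && inCriteria && /^- /.test(line)) {
      current.acceptance_criteria.push(line.slice(2).trim());
    } else if (line.trim() !== "") {
      inCriteria = false; // any other non-blank line ends the bullet list
    }
  }
  return { tasks, warnings };
}
```

Returning warnings alongside the tasks matches the "defaults to modify, warns" behavior described in the list and leaves the manual review step to the caller.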
+## Design decisions (locked v1)
+These were called out as open questions during draft; resolved here so implementation can proceed.
+1. **Persona enum:** dropped `data` — covered by `backend`. Six personas + `none`.
+2. **`acceptance_criteria` vs `verification[]`:** kept separate. AC is the human-facing definition of done (lands in commit messages, milestone summaries, ERP reports). `verification[]` is the mechanical execution path. The verifier never interprets AC — it executes `verification[]`. This separation is the whole point of the contract.
+3. **`action` cap:** 500 chars. Advisory only. Validator enforces.
+4. **Versioning:** in-place migration via `compile-plan --upgrade`. `version` field tells the loader which schema to apply. No filename suffixes — canonical filename stays `phase-N-contract.json`.
+5. **Wave placement:** lives on the task. The validator enforces `task.wave > max(deps wave)` so the redundancy with `depends_on` is contained. Wave is a scheduling/display hint; `depends_on` is the constraint.
+6. **`behavioral` checks:** permanent. UX feel, error message clarity, animation timing — none of these are deterministic. The escape hatch is healthy. The `evidence_required[]` field forces verifier to cite proof; vibes-based passes are blocked at the schema level.
+7. **Validator:** hand-rolled in plain Node. Framework keeps zero npm dependencies. Zod is rejected for this layer.
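Decision 5 treats `depends_on` as the constraint and `wave` as a hint derived from it. A small sketch of deriving the smallest valid wave for each task is below. The helper name `assignWaves` is hypothetical, and the contract does not require planners to pick minimal waves. The same traversal also surfaces dependency cycles.

```javascript
// Hypothetical helper: assigns each task the smallest wave satisfying
// wave > max(wave of its dependencies). Throws on an unknown id or a cycle.
function assignWaves(tasks) {
  const byId = new Map(tasks.map(t => [t.id, t]));
  const wave = new Map();
  const visiting = new Set(); // ids on the current recursion path

  function resolve(id) {
    if (wave.has(id)) return wave.get(id);
    if (visiting.has(id)) throw new Error(`dependency cycle through ${id}`);
    const task = byId.get(id);
    if (!task) throw new Error(`unknown task ${id}`);
    visiting.add(id);
    // A task with no dependencies lands in wave 1.
    const w = 1 + Math.max(0, ...task.depends_on.map(dep => resolve(dep)));
    visiting.delete(id);
    wave.set(id, w);
    return w;
  }

  for (const t of tasks) resolve(t.id);
  return wave;
}
```

For the two-task example contract, this yields wave 1 for `T1` and wave 2 for `T2`, matching the values written by hand.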
+## Migration plan
+1. Add schema + validator + `compile-plan` command. No callers yet.
+2. Backfill contracts for active projects via `compile-plan` — manual review of warnings.
+3. Update planner agent prompt to emit JSON alongside markdown.
+4. Update builder skill to read JSON for files/AC/verification; markdown still readable.
+5. Update verifier agent to execute `verification[]` deterministically; keep prose verification report for humans.
+6. Update plan-checker to validate JSON.
+7. After two milestones run cleanly on JSON, mark prose plan as advisory-only in docs.
+No hard cutover. Both formats coexist during migration.