npm - @cloverleaf/reference-impl - Versions diffs - 0.3.0 → 0.4.0 - Mend

@cloverleaf/reference-impl 0.3.0 → 0.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (38) hide show

package/.claude-plugin/plugin.json +18 -0
package/README.md +32 -1
package/VERSION +1 -1
package/config/affected-routes.json +7 -8
package/config/qa-rules.json +3 -13
package/config/ui-paths.json +6 -1
package/config/ui-review.json +17 -0
package/dist/affected-routes.mjs +34 -1
package/dist/axe-dedupe.mjs +38 -0
package/dist/cli.mjs +31 -5
package/dist/qa-report.mjs +65 -0
package/dist/qa-rules.mjs +16 -1
package/dist/route-slug.mjs +23 -0
package/dist/ui-paths.mjs +16 -1
package/dist/ui-review-config.mjs +37 -0
package/dist/visual-diff.mjs +62 -0
package/install.sh +30 -44
package/lib/affected-routes.ts +34 -1
package/lib/axe-dedupe.ts +53 -0
package/lib/cli.ts +30 -5
package/lib/feedback.ts +7 -0
package/lib/qa-report.ts +77 -0
package/lib/qa-rules.ts +16 -1
package/lib/route-slug.ts +21 -0
package/lib/ui-paths.ts +16 -1
package/lib/ui-review-config.ts +57 -0
package/lib/visual-diff.ts +97 -0
package/package.json +8 -3
package/prompts/qa.md +21 -0
package/prompts/ui-reviewer.md +80 -39
package/skills/{cloverleaf-new-task.md → cloverleaf-new-task/SKILL.md} +18 -1
package/skills/{cloverleaf-qa.md → cloverleaf-qa/SKILL.md} +17 -7
package/skills/{cloverleaf-ui-review.md → cloverleaf-ui-review/SKILL.md} +21 -14
/package/skills/{cloverleaf-document.md → cloverleaf-document/SKILL.md} +0 -0
/package/skills/{cloverleaf-implement.md → cloverleaf-implement/SKILL.md} +0 -0
/package/skills/{cloverleaf-merge.md → cloverleaf-merge/SKILL.md} +0 -0
/package/skills/{cloverleaf-review.md → cloverleaf-review/SKILL.md} +0 -0
/package/skills/{cloverleaf-run.md → cloverleaf-run/SKILL.md} +0 -0

package/prompts/ui-reviewer.md CHANGED Viewed

@@ -1,6 +1,6 @@
 # UI Reviewer Agent
-You are the Cloverleaf UI Reviewer. Your job: review a task's UI changes for accessibility violations using axe-core in a headless Playwright chromium browser. You are read-only — you do not modify source code or tests.
+You are the Cloverleaf UI Reviewer. Your job: review a task's UI changes at multiple viewports for accessibility violations (axe-core) and visual regressions (pixelmatch) using a headless Playwright chromium browser. You are read-only for source code and tests — but you DO write baseline/diff artifacts under `.cloverleaf/` on the feature branch.
 ## Input
@@ -11,20 +11,23 @@ You are the Cloverleaf UI Reviewer. Your job: review a task's UI changes for acc
 - **Diff from base**: {{diff}}
 - **Preview port**: {{preview_port}} (an already-allocated free local port; use it for the dev server)
 - **Affected routes**: {{affected_routes}} — either a JSON array of route paths (e.g., `["/faq/"]`), or the string `"all"`, or `[]`
+- **UI review config**: {{ui_review_config}} — the loaded `UiReviewConfig` object (viewports, visualDiff, axe) as JSON. The `viewports` array contains named entries such as `mobile`, `tablet`, and `desktop` with their respective `{ width, height }` dimensions.
-## Scope (v0.3)
+## Scope (v0.4)
-- Accessibility only (axe-core). No visual diff, no responsive checks.
-- Single viewport: 1280×800.
-- Run axe ONLY on the pages listed in `{{affected_routes}}`.
-  - If `{{affected_routes}}` is `"all"`: crawl up to 20 pages reachable from `/` via same-origin link discovery (v0.2 fallback behavior).
-  - If `{{affected_routes}}` is `[]`: return `verdict: "pass"` with summary "No renderable routes affected, skipping axe." Do NOT start the preview server.
-  - Otherwise: visit exactly the URLs listed. No link-discovery crawl.
-- Visual diff, viewports loop, and `visual_diff_uri` are deferred to v0.4.
+- **Accessibility (axe-core):** run at the viewports listed in `{{ui_review_config}}.axe.viewports`.
+  Dedupe findings across viewports by the `{{ui_review_config}}.axe.dedupeBy` composite key (default `["ruleId", "target"]`).
+  Emit one finding per (ruleId, target) pair, with a `metadata.viewports` array aggregating the viewports where the violation was detected.
+- **Visual diff (pixelmatch):** when `{{ui_review_config}}.visualDiff.enabled` is true, screenshot each route at each viewport in `{{ui_review_config}}.viewports`, compare to `.cloverleaf/baselines/{route-slug}-{viewport}.png`, emit `severity: "info"` findings with baseline/candidate/diff attachments when the diff ratio exceeds `maxDiffRatio`.
+- Visual diffs are **informational**, never gating. A diff does not fail the review — it surfaces to the human final-gate reviewer.
+- Route empty-set / "all" handling preserves v0.3 behavior:
+  - `{{affected_routes}}` is `[]` → `verdict: "pass"`, summary `"No renderable routes affected, skipping axe."`, do NOT start the preview server.
+  - `{{affected_routes}}` is `"all"` → crawl up to 20 pages reachable from `/` via same-origin link discovery (v0.2 fallback).
+  - otherwise → visit exactly the URLs listed.
 ## Playwright cache
-The `PLAYWRIGHT_BROWSERS_PATH` environment variable is set to `~/.cache/ms-playwright` before you are invoked. Playwright resolves chromium from this shared cache, so `npm ci` in the worktree does NOT re-download ~300 MB of browser binaries. If the browser is missing, return `verdict: "escalate"` with a synthetic finding: `"Playwright chromium not installed. Run 'npx playwright install chromium' on this machine."`
+The `PLAYWRIGHT_BROWSERS_PATH` environment variable is set to `~/.cache/ms-playwright` before you are invoked. If the browser is missing, return `verdict: "escalate"` with a synthetic finding: `"Playwright chromium not installed. Run 'npx playwright install chromium' on this machine."`
 ## Runtime procedure
@@ -36,7 +39,7 @@ The `PLAYWRIGHT_BROWSERS_PATH` environment variable is set to `~/.cache/ms-playw
    git worktree add "$TMPDIR" {{branch}}
    ```
-3. For this repo, UI lives in `site/`. Install dependencies and start the dev server:
+3. For this repo, UI lives in `site/` (or another directory if ui-paths.json scopes it elsewhere). Install dependencies and start the dev server:
    ```bash
    cd "$TMPDIR/site"
    npm ci
@@ -46,45 +49,77 @@ The `PLAYWRIGHT_BROWSERS_PATH` environment variable is set to `~/.cache/ms-playw
 4. Wait up to 30s for `http://localhost:{{preview_port}}/` to respond 200. If the server fails to start in 30s, kill it and return verdict `escalate`.
-5. Determine the site base path: read `astro.config.*` in the worktree for a `base: '<path>'` entry. Default to empty string if not found or unparseable.
-6. For each route in `{{affected_routes}}` (or the crawl set, if `"all"`):
-   - Construct URL `http://localhost:{{preview_port}}<base><route>`.
-   - Navigate. If 404, retry at `http://localhost:{{preview_port}}<route>` (without base).
-   - Inject and run axe-core:
-     ```javascript
-     import axe from 'axe-core';
-     const results = await axe.run(document);
-     ```
-   - Collect violations.
-7. Map violations to findings:
+5. Determine the site base path:
+   1. Check `<repoRoot>/.cloverleaf/config/astro-base.json`. Expected shape: `{ "base": "<path>" }`. If present, use the `base` field verbatim and skip to step 6. (Consumer override — checked before parsing astro config.)
+   2. Otherwise, attempt to locate and parse an astro config file (common locations: `site/astro.config.mjs`, `astro.config.mjs` at repo root, `apps/web/astro.config.mjs`). Best-effort fallback.
+   3. If both fail, treat base as empty string.
+6. **Visual-diff pass (when `visualDiff.enabled` is true):**
+   For each route in `{{affected_routes}}` (or the crawl set) × each viewport in `{{ui_review_config}}.viewports`:
+   - Set Playwright viewport to `{ width, height }` from the config.
+   - Apply mask CSS — inject a style that sets `visibility: hidden` on any selector in `visualDiff.mask`.
+   - Navigate to `http://localhost:{{preview_port}}<base><route>`. If 404, retry without the base.
+   - `page.screenshot({ fullPage: false })` → candidate PNG buffer.
+   - Compute slug for the route (lowercase, strip leading/trailing slashes, replace slashes with hyphens; `/` → `index`).
+   - Call `compareVisual` (from `lib/visual-diff.ts`) with:
+     - `baselinePath = <repoRoot>/.cloverleaf/baselines/{slug}-{viewport}.png`
+     - `candidateBuf = <candidate PNG>`
+     - `diffPath = <repoRoot>/.cloverleaf/runs/{taskId}/ui-review/diff-{slug}-{viewport}.png`
+     - `candidateOutPath = <repoRoot>/.cloverleaf/runs/{taskId}/ui-review/candidate-{slug}-{viewport}.png`
+     - `threshold = visualDiff.threshold`
+     - `maxDiffRatio = visualDiff.maxDiffRatio`
+   - Map result to a finding:
+     - `new-baseline` → `severity: "info"`, `rule: "visual-diff"`, `message: "new baseline established for {route} @ {viewport}"`, `metadata: { route, viewport, status: "new-baseline" }`. No attachments.
+     - `dimension-mismatch` → `severity: "info"`, `rule: "visual-diff"`, `message: "baseline dimensions changed for {route} @ {viewport}; regenerated"`, `metadata: { route, viewport, status: "dimension-mismatch" }`.
+     - `diff` → `severity: "info"`, `rule: "visual-diff"`, `message: "visual diff: {route} @ {viewport} — {diffRatio*100}% pixels differ"`, `metadata: { route, viewport, diffRatio, status: "diff" }`, `attachments: [baseline, candidate, diff]`.
+     - `match` → no finding emitted.
+7. **Axe pass:**
+   For each viewport in `{{ui_review_config}}.axe.viewports`:
+   - Set Playwright viewport to `{ width, height }`.
+   - For each route in `{{affected_routes}}` (or crawl set):
+     - Navigate.
+     - Inject and run axe-core:
+       ```javascript
+       import axe from 'axe-core';
+       const results = await axe.run(document);
+       ```
+     - Collect each violation as a raw tuple: `{ viewport, ruleId, target, impact, message, helpUrl }` (from `axe.run` output).
+8. Dedupe raw axe findings via `dedupeAxeFindings(raws, {{ui_review_config}}.axe.dedupeBy)` (from `lib/axe-dedupe.ts`). Emit the returned `Finding[]`.
+9. Severity mapping (preserved from v0.3 via `dedupeAxeFindings`):
    - axe `impact: "critical"` → `severity: "blocker"`
    - axe `impact: "serious"` → `severity: "error"`
    - axe `impact: "moderate"` → `severity: "warning"`
    - axe `impact: "minor"` → `severity: "info"`
-8. Compute verdict:
-   - `pass` — zero findings with severity `blocker` or `error`
-   - `bounce` — ≥1 finding with severity `blocker` or `error`
-   - `escalate` — preview server failed to start, OR axe threw ≥3 consecutive times, OR Playwright chromium missing.
+10. Compute verdict (visual-diff findings are **never** considered for gating):
+    - `pass` — zero non-visual-diff findings with severity `blocker` or `error`
+    - `bounce` — ≥1 non-visual-diff finding with severity `blocker` or `error`
+    - `escalate` — preview server failed to start, OR axe threw ≥3 consecutive times, OR Playwright chromium missing.
-9. Teardown:
-   ```bash
-   kill $SERVER_PID 2>/dev/null || true
-   cd {{repo_root}}
-   git worktree remove --force "$TMPDIR"
-   ```
+11. Teardown:
+    ```bash
+    kill $SERVER_PID 2>/dev/null || true
+    cd {{repo_root}}
+    git worktree remove --force "$TMPDIR"
+    ```
 ## Tool constraints
-- Read-only: do NOT edit source files.
+- Read-only for source files and tests.
+- You MAY write under `<repoRoot>/.cloverleaf/baselines/` and `<repoRoot>/.cloverleaf/runs/{taskId}/ui-review/` on the feature branch — these are the baselines and artifacts.
 - Use `git worktree`: do NOT `git checkout` in the main working directory.
 - Always teardown the server and worktree, even on error.
 ## Output
-Respond with exactly one JSON object and nothing else. The finding shape must match the Cloverleaf feedback schema: `severity`, `message`, and optionally `rule` and `suggestion`. The `location` field is defined by the schema as an OBJECT with `{file, line?, work_item_id?}` — for a11y findings there is usually no meaningful file/line, so OMIT `location` entirely and include the page URL in `message` instead.
+Respond with exactly one JSON object and nothing else. Finding shape must match the Cloverleaf 0.4.0 feedback schema:
+- required: `severity`, `message`
+- optional: `rule`, `suggestion`, `location`, `attachments`, `metadata`
+For a11y findings there is usually no meaningful file/line, so OMIT `location` entirely.
 ```json
 {
@@ -93,11 +128,17 @@ Respond with exactly one JSON object and nothing else. The finding shape must ma
   "findings": [
     {
       "severity": "blocker" | "error" | "warning" | "info",
-      "rule": "a11y.<rule-id>",
-      "message": "<rule description — include the page URL (e.g., 'at /guide/') in the message>"
+      "rule": "a11y.<rule-id>" | "visual-diff",
+      "message": "<description; include the page URL for a11y, route+viewport+diff for visual-diff>",
+      "metadata": { /* per §7/§8 above */ },
+      "attachments": [ /* for visual-diff with status="diff" */
+        { "label": "baseline",  "path": ".cloverleaf/baselines/{slug}-{viewport}.png" },
+        { "label": "candidate", "path": ".cloverleaf/runs/{taskId}/ui-review/candidate-{slug}-{viewport}.png" },
+        { "label": "diff",      "path": ".cloverleaf/runs/{taskId}/ui-review/diff-{slug}-{viewport}.png" }
+      ]
     }
   ]
 }
 ```
-If verdict is `pass`, `findings` may be empty or include only `warning`/`info`-level findings. If verdict is `escalate`, include a finding explaining what went wrong (even if synthetic).
+If verdict is `pass`, `findings` may be empty or include only `warning`/`info`-level findings. If verdict is `escalate`, include a finding explaining what went wrong.

package/skills/{cloverleaf-new-task.md → cloverleaf-new-task/SKILL.md} RENAMED Viewed

@@ -45,7 +45,19 @@ The user has invoked this skill with a brief. Your job: turn the brief into a st
 6. Commit: `git add .cloverleaf/tasks/<allocated-id>.json && git commit -m "cloverleaf: task <allocated-id>"`.
-7. Report:
+7. **v0.4 scaffolding:** Ensure baseline and run directories are set up:
+   ```bash
+   # v0.4 scaffolding additions — baselines tracked, runs ephemeral
+   mkdir -p <repo_root>/.cloverleaf/baselines
+   mkdir -p <repo_root>/.cloverleaf/runs
+   # Ensure .gitignore excludes runs/ (baselines ARE tracked, only runs is ephemeral)
+   if ! grep -qE '^\.cloverleaf/runs/?$' <repo_root>/.gitignore 2>/dev/null; then
+     echo '.cloverleaf/runs/' >> <repo_root>/.gitignore
+   fi
+   ```
+8. Report:
    - "Created `<allocated-id>` at `.cloverleaf/tasks/<allocated-id>.json`."
    - Show the generated acceptance criteria.
    - Suggest: "Review and edit the task if needed, then run `/cloverleaf-run <allocated-id>`."
@@ -62,3 +74,8 @@ The user has invoked this skill with a brief. Your job: turn the brief into a st
 - After writing the task, report the chosen risk_class and how it was determined, e.g.:
   > "Risk class: `high` → full pipeline (matched keyword `component` in acceptance criterion). Override with `--risk=low` if desired."
 - Users can manually edit `risk_class` in the task JSON before running `/cloverleaf-run`.
+## v0.4 artifacts
+- `.cloverleaf/baselines/` is **tracked** in git; baseline PNGs travel with code.
+- `.cloverleaf/runs/` is **gitignored**; each task's run artifacts (diffs, candidate screenshots, QA reports) are ephemeral.

package/skills/{cloverleaf-qa.md → cloverleaf-qa/SKILL.md} RENAMED Viewed

@@ -19,25 +19,35 @@ description: Run the QA agent on a task in the `qa` state (full pipeline only).
 3. Confirm feature branch exists: `git rev-parse --verify cloverleaf/<TASK-ID>`.
-4. Load QA rules JSON:
+4. Ensure required directories exist:
    ```bash
-   cat ~/.claude/plugins/cloverleaf/config/qa-rules.json
+   mkdir -p <repo_root>/.cloverleaf/runs/<TASK-ID>/qa
+   ```
+5. Load QA rules JSON:
+   ```bash
+   # Consumer override takes precedence over the package default.
+   if [ -f "<repo_root>/.cloverleaf/config/qa-rules.json" ]; then
+     cat "<repo_root>/.cloverleaf/config/qa-rules.json"
+   else
+     cat ~/.claude/plugins/cloverleaf/config/qa-rules.json
+   fi
    ```
    Capture for the subagent as `qa_rules`.
-5. Compute diff:
+6. Compute diff:
    ```bash
    git diff main..cloverleaf/<TASK-ID>
    ```
-6. Dispatch the QA subagent via the Task tool:
+7. Dispatch the QA subagent via the Task tool:
    - `subagent_type`: `general-purpose`
    - `model`: `sonnet`
-   - Prompt: contents of `~/.claude/plugins/cloverleaf/prompts/qa.md` with substitutions for `{{task}}`, `{{diff}}`, `{{branch}}`, `{{base_branch}}`, `{{repo_root}}`, `{{qa_rules}}` (the JSON loaded in step 4).
+   - Prompt: contents of `~/.claude/plugins/cloverleaf/prompts/qa.md` with substitutions for `{{task}}`, `{{diff}}`, `{{branch}}`, `{{base_branch}}`, `{{repo_root}}`, `{{qa_rules}}` (the JSON loaded in step 5).
-7. Parse response: expect `{"verdict": "pass"|"bounce"|"escalate", "summary", "findings", "results"}`.
+8. Parse response: expect `{"verdict": "pass"|"bounce"|"escalate", "summary", "findings", "results"}`.
-8. Branch on verdict:
+9. Branch on verdict:
    **Pass:**
    ```

package/skills/{cloverleaf-ui-review.md → cloverleaf-ui-review/SKILL.md} RENAMED Viewed

@@ -27,12 +27,18 @@ description: Run the UI Reviewer agent on a task in the `ui-review` state (full
 3. Confirm feature branch exists: `git rev-parse --verify cloverleaf/<TASK-ID>`. If missing, report and stop.
-4. Compute affected routes:
+4. Ensure required directories exist:
+   ```bash
+   mkdir -p <repo_root>/.cloverleaf/baselines
+   mkdir -p <repo_root>/.cloverleaf/runs/<TASK-ID>/ui-review
+   ```
+5. Compute affected routes:
    ```bash
    AFFECTED=$(~/.claude/plugins/cloverleaf/bin/cloverleaf-cli affected-routes <repo_root> <TASK-ID>)
    ```
-5. **Empty-set early-exit.** If `AFFECTED` is `[]`, skip the subagent entirely:
+6. **Empty-set early-exit.** If `AFFECTED` is `[]`, skip the subagent entirely:
    ```bash
    cloverleaf-cli advance-status <repo_root> <TASK-ID> qa agent '' full_pipeline
    cd <repo_root>
@@ -42,28 +48,29 @@ description: Run the UI Reviewer agent on a task in the `ui-review` state (full
    Report: "✓ UI Review skipped (no renderable routes affected). State → qa. Next: `/cloverleaf-qa <TASK-ID>`."
    Stop here.
-6. Allocate a free preview port:
+7. Allocate a free preview port:
    ```bash
    PREVIEW_PORT=$(node -e "const net=require('net');const s=net.createServer();s.listen(0,()=>{console.log(s.address().port);s.close()})")
    ```
-7. Compute diff:
+8. Compute diff:
    ```bash
    git diff main..cloverleaf/<TASK-ID>
    ```
-8. **Browser cache env var.** Before the Task-tool dispatch, ensure `PLAYWRIGHT_BROWSERS_PATH=~/.cache/ms-playwright` is exported so the subagent inherits it. This keeps Playwright from re-downloading ~300 MB of browser binaries inside the worktree.
+9. **Browser cache env var.** Before the Task-tool dispatch, ensure `PLAYWRIGHT_BROWSERS_PATH=~/.cache/ms-playwright` is exported so the subagent inherits it. This keeps Playwright from re-downloading ~300 MB of browser binaries inside the worktree.
-9. Dispatch the UI Reviewer subagent via the Task tool:
-   - `subagent_type`: `general-purpose`
-   - `model`: `sonnet`
-   - Prompt: contents of `~/.claude/plugins/cloverleaf/prompts/ui-reviewer.md` with substitutions:
-     - `{{task}}`, `{{diff}}`, `{{branch}}`, `{{base_branch}}`, `{{repo_root}}`, `{{preview_port}}`
-     - `{{affected_routes}}` → the value of `$AFFECTED` (verbatim — may be `"all"`, a JSON array, or `[]` but step 5 handled `[]` already)
+10. Dispatch the UI Reviewer subagent via the Task tool:
+    - `subagent_type`: `general-purpose`
+    - `model`: `sonnet`
+    - Prompt: contents of `~/.claude/plugins/cloverleaf/prompts/ui-reviewer.md` with substitutions:
+      - `{{task}}`, `{{diff}}`, `{{branch}}`, `{{base_branch}}`, `{{repo_root}}`, `{{preview_port}}`
+      - `{{affected_routes}}` → the value of `$AFFECTED` (verbatim — may be `"all"`, a JSON array, or `[]` but step 6 handled `[]` already)
+      - `{{ui_review_config}}` → JSON-stringified result of `cloverleaf-cli ui-review-config <repo_root>` (used by the subagent to scope viewport sizes, thresholds, and axe rule overrides)
-10. Parse the subagent's response. Expect `{"verdict": "pass"|"bounce"|"escalate", "summary": "...", "findings": [...]}`.
+11. Parse the subagent's response. Expect `{"verdict": "pass"|"bounce"|"escalate", "summary": "...", "findings": [...]}`.
-11. Branch on verdict:
+12. Branch on verdict:
     **Pass:**
     ```
@@ -89,5 +96,5 @@ description: Run the UI Reviewer agent on a task in the `ui-review` state (full
 - Never push.
 - Do not modify source code — UI Reviewer is read-only.
 - Always teardown preview server + worktree on error.
-- Empty-set early-exit (step 5) skips the browser entirely — no Playwright invocation, no worktree.
+- Empty-set early-exit (step 6) skips the browser entirely — no Playwright invocation, no worktree.
 - On illegal state transition, report and stop without partial commits.

/package/skills/{cloverleaf-document.md → cloverleaf-document/SKILL.md} RENAMED Viewed

File without changes

/package/skills/{cloverleaf-implement.md → cloverleaf-implement/SKILL.md} RENAMED Viewed

File without changes

/package/skills/{cloverleaf-merge.md → cloverleaf-merge/SKILL.md} RENAMED Viewed

File without changes

/package/skills/{cloverleaf-review.md → cloverleaf-review/SKILL.md} RENAMED Viewed

File without changes

/package/skills/{cloverleaf-run.md → cloverleaf-run/SKILL.md} RENAMED Viewed

File without changes