ridgeline 0.6.0 → 0.7.2
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +8 -5
- package/dist/agents/core/designer.md +131 -0
- package/dist/agents/core/refiner.md +28 -1
- package/dist/agents/core/researcher.md +30 -11
- package/dist/agents/core/specifier.md +16 -0
- package/dist/agents/researchers/gaps.md +67 -0
- package/dist/agents/specifiers/visual-coherence.md +55 -0
- package/dist/cli.js +39 -8
- package/dist/cli.js.map +1 -1
- package/dist/commands/create.js +16 -1
- package/dist/commands/create.js.map +1 -1
- package/dist/commands/design.d.ts +8 -0
- package/dist/commands/design.js +130 -0
- package/dist/commands/design.js.map +1 -0
- package/dist/commands/index.d.ts +1 -0
- package/dist/commands/index.js +3 -1
- package/dist/commands/index.js.map +1 -1
- package/dist/commands/plan.js +3 -3
- package/dist/commands/plan.js.map +1 -1
- package/dist/commands/qa-workflow.d.ts +33 -0
- package/dist/commands/qa-workflow.js +139 -0
- package/dist/commands/qa-workflow.js.map +1 -0
- package/dist/commands/refine.d.ts +1 -0
- package/dist/commands/refine.js +17 -4
- package/dist/commands/refine.js.map +1 -1
- package/dist/commands/research.js +22 -8
- package/dist/commands/research.js.map +1 -1
- package/dist/commands/rewind.js +2 -2
- package/dist/commands/rewind.js.map +1 -1
- package/dist/commands/shape.js +36 -121
- package/dist/commands/shape.js.map +1 -1
- package/dist/commands/spec.js +1 -0
- package/dist/commands/spec.js.map +1 -1
- package/dist/engine/claude/stream.display.js +0 -1
- package/dist/engine/claude/stream.display.js.map +1 -1
- package/dist/engine/claude/stream.parse.d.ts +1 -15
- package/dist/engine/claude/stream.parse.js +3 -21
- package/dist/engine/claude/stream.parse.js.map +1 -1
- package/dist/engine/claude/stream.result.js +2 -2
- package/dist/engine/claude/stream.types.d.ts +15 -0
- package/dist/engine/claude/stream.types.js +23 -0
- package/dist/engine/claude/stream.types.js.map +1 -0
- package/dist/engine/discovery/agent.registry.d.ts +4 -0
- package/dist/engine/discovery/agent.registry.js +46 -18
- package/dist/engine/discovery/agent.registry.js.map +1 -1
- package/dist/engine/discovery/flavour.config.d.ts +9 -0
- package/dist/engine/discovery/flavour.config.js +61 -0
- package/dist/engine/discovery/flavour.config.js.map +1 -0
- package/dist/engine/discovery/plugin.scan.d.ts +1 -0
- package/dist/engine/discovery/plugin.scan.js +29 -1
- package/dist/engine/discovery/plugin.scan.js.map +1 -1
- package/dist/engine/discovery/skill.check.d.ts +19 -0
- package/dist/engine/discovery/skill.check.js +145 -0
- package/dist/engine/discovery/skill.check.js.map +1 -0
- package/dist/engine/pipeline/build.exec.js +1 -0
- package/dist/engine/pipeline/build.exec.js.map +1 -1
- package/dist/engine/pipeline/phase.sequence.js +10 -10
- package/dist/engine/pipeline/phase.sequence.js.map +1 -1
- package/dist/engine/pipeline/pipeline.shared.d.ts +6 -0
- package/dist/engine/pipeline/pipeline.shared.js +24 -1
- package/dist/engine/pipeline/pipeline.shared.js.map +1 -1
- package/dist/engine/pipeline/plan.exec.js +1 -0
- package/dist/engine/pipeline/plan.exec.js.map +1 -1
- package/dist/engine/pipeline/refine.exec.d.ts +2 -0
- package/dist/engine/pipeline/refine.exec.js +13 -2
- package/dist/engine/pipeline/refine.exec.js.map +1 -1
- package/dist/engine/pipeline/research.exec.d.ts +3 -0
- package/dist/engine/pipeline/research.exec.js +74 -5
- package/dist/engine/pipeline/research.exec.js.map +1 -1
- package/dist/engine/pipeline/review.exec.js +23 -0
- package/dist/engine/pipeline/review.exec.js.map +1 -1
- package/dist/engine/pipeline/specify.exec.d.ts +1 -0
- package/dist/engine/pipeline/specify.exec.js +114 -44
- package/dist/engine/pipeline/specify.exec.js.map +1 -1
- package/dist/flavours/data-analysis/core/refiner.md +28 -1
- package/dist/flavours/data-analysis/core/researcher.md +30 -11
- package/dist/flavours/data-analysis/researchers/gaps.md +59 -0
- package/dist/flavours/game-dev/core/refiner.md +28 -1
- package/dist/flavours/game-dev/core/researcher.md +30 -11
- package/dist/flavours/game-dev/researchers/gaps.md +59 -0
- package/dist/flavours/legal-drafting/core/refiner.md +28 -1
- package/dist/flavours/legal-drafting/core/researcher.md +30 -11
- package/dist/flavours/legal-drafting/researchers/gaps.md +59 -0
- package/dist/flavours/machine-learning/core/refiner.md +28 -1
- package/dist/flavours/machine-learning/core/researcher.md +30 -11
- package/dist/flavours/machine-learning/researchers/gaps.md +59 -0
- package/dist/flavours/mobile-app/core/refiner.md +28 -1
- package/dist/flavours/mobile-app/core/researcher.md +30 -11
- package/dist/flavours/mobile-app/researchers/gaps.md +59 -0
- package/dist/flavours/music-composition/core/refiner.md +28 -1
- package/dist/flavours/music-composition/core/researcher.md +30 -11
- package/dist/flavours/music-composition/researchers/gaps.md +59 -0
- package/dist/flavours/novel-writing/core/refiner.md +28 -1
- package/dist/flavours/novel-writing/core/researcher.md +30 -11
- package/dist/flavours/novel-writing/researchers/gaps.md +59 -0
- package/dist/flavours/screenwriting/core/refiner.md +28 -1
- package/dist/flavours/screenwriting/core/researcher.md +30 -11
- package/dist/flavours/screenwriting/researchers/gaps.md +59 -0
- package/dist/flavours/security-audit/core/refiner.md +28 -1
- package/dist/flavours/security-audit/core/researcher.md +30 -11
- package/dist/flavours/security-audit/researchers/gaps.md +59 -0
- package/dist/flavours/software-engineering/core/builder.md +2 -0
- package/dist/flavours/software-engineering/core/refiner.md +28 -1
- package/dist/flavours/software-engineering/core/researcher.md +30 -11
- package/dist/flavours/software-engineering/core/reviewer.md +2 -0
- package/dist/flavours/software-engineering/flavour.json +7 -0
- package/dist/flavours/software-engineering/researchers/gaps.md +59 -0
- package/dist/flavours/technical-writing/core/refiner.md +28 -1
- package/dist/flavours/technical-writing/core/researcher.md +30 -11
- package/dist/flavours/technical-writing/researchers/gaps.md +59 -0
- package/dist/flavours/test-suite/core/refiner.md +28 -1
- package/dist/flavours/test-suite/core/researcher.md +30 -11
- package/dist/flavours/test-suite/researchers/gaps.md +59 -0
- package/dist/flavours/translation/core/refiner.md +28 -1
- package/dist/flavours/translation/core/researcher.md +30 -11
- package/dist/flavours/translation/researchers/gaps.md +59 -0
- package/dist/flavours/web-game/core/builder.md +123 -0
- package/dist/flavours/web-game/core/reviewer.md +159 -0
- package/dist/flavours/web-game/flavour.json +9 -0
- package/dist/flavours/web-ui/core/builder.md +117 -0
- package/dist/flavours/web-ui/core/reviewer.md +155 -0
- package/dist/flavours/web-ui/flavour.json +10 -0
- package/dist/plugin/visual-tools/plugin.json +4 -0
- package/dist/plugin/visual-tools/skills/a11y-audit/SKILL.md +57 -0
- package/dist/plugin/visual-tools/skills/agent-browser/SKILL.md +56 -0
- package/dist/plugin/visual-tools/skills/agent-browser/references/viewports.md +17 -0
- package/dist/plugin/visual-tools/skills/canvas-screenshot/SKILL.md +84 -0
- package/dist/plugin/visual-tools/skills/css-audit/SKILL.md +50 -0
- package/dist/plugin/visual-tools/skills/lighthouse/SKILL.md +58 -0
- package/dist/plugin/visual-tools/skills/shader-validate/SKILL.md +77 -0
- package/dist/plugin/visual-tools/skills/visual-diff/SKILL.md +68 -0
- package/dist/shapes/detect.d.ts +8 -0
- package/dist/shapes/detect.js +87 -0
- package/dist/shapes/detect.js.map +1 -0
- package/dist/shapes/game-visual.json +8 -0
- package/dist/shapes/print-layout.json +8 -0
- package/dist/shapes/web-visual.json +9 -0
- package/dist/stores/budget.js +2 -1
- package/dist/stores/budget.js.map +1 -1
- package/dist/stores/feedback.format.d.ts +3 -0
- package/dist/stores/feedback.format.js +62 -0
- package/dist/stores/feedback.format.js.map +1 -0
- package/dist/stores/feedback.parse.d.ts +2 -0
- package/dist/stores/feedback.parse.js +121 -0
- package/dist/stores/feedback.parse.js.map +1 -0
- package/dist/stores/feedback.verdict.d.ts +2 -4
- package/dist/stores/feedback.verdict.js +7 -175
- package/dist/stores/feedback.verdict.js.map +1 -1
- package/dist/stores/index.d.ts +1 -1
- package/dist/stores/index.js +1 -2
- package/dist/stores/index.js.map +1 -1
- package/dist/stores/state.d.ts +4 -0
- package/dist/stores/state.js +37 -5
- package/dist/stores/state.js.map +1 -1
- package/dist/stores/trajectory.d.ts +2 -3
- package/dist/stores/trajectory.js +6 -7
- package/dist/stores/trajectory.js.map +1 -1
- package/dist/types.d.ts +11 -1
- package/dist/utils/atomic-write.d.ts +6 -0
- package/dist/utils/atomic-write.js +62 -0
- package/dist/utils/atomic-write.js.map +1 -0
- package/package.json +2 -2
@@ -0,0 +1,117 @@
+---
+name: builder
+description: Implements a single phase spec for web UI development — responsive layouts, component architecture, accessibility, and visual quality
+model: opus
+---
+
+You are a web UI builder. You receive a single phase spec and implement it. You have full tool access. Use it.
+
+## Your inputs
+
+These are injected into your context before you start:
+
+1. **Phase spec** — your assignment. Contains Goal, Context, Acceptance Criteria, and Spec Reference.
+2. **constraints.md** — non-negotiable technical guardrails. Language, framework, directory layout, naming conventions, dependencies, check command, responsive breakpoints, browser support targets, and accessibility requirements.
+3. **taste.md** (optional) — coding style and design system preferences. Follow unless you have a concrete reason not to.
+4. **design.md** (optional) — design system definition. Color tokens, typography scale, spacing system, responsive breakpoints, component patterns. Treat as hard constraints when present.
+5. **handoff.md** — accumulated state from prior phases. What was built, decisions made, deviations, notes.
+6. **feedback file** (retry only) — reviewer feedback on what failed. Present only if this is a retry.
+
+## Your process
+
+### 1. Orient
+
+Read handoff.md. Then explore the actual codebase — understand the current state before you touch anything. Also assess design.md for color tokens, typography scale, spacing system, and responsive breakpoints before writing code. Know the design system before you write a single line of CSS.
+
+### 2. Implement
+
+Build what the phase spec asks for. Build mobile-first. Use semantic HTML. Follow the spacing and color tokens from design.md as hard constraints. Structure components for reusability. You decide the approach: file creation order, internal structure, patterns. constraints.md defines the boundaries. Everything inside those boundaries is your call.
+
+Do not implement work belonging to other phases. Do not add features not in your spec. Do not refactor code unless your phase requires it.
+
+### 3. Check
+
+Verify your work after making changes. If a check command is specified in constraints.md, run it. If specialist agents are available, use the **verifier** agent — it can intelligently verify your work even when no check command exists.
+
+After running the check command, capture screenshots at 375px, 768px, and 1440px viewports to verify responsive behavior. Run a CSS audit to check for design system drift. Run accessibility checks against WCAG 2.1 AA.
+
+- If checks pass, continue.
+- If checks fail, fix the failures. Then check again.
+- Do not skip verification. Do not ignore failures. Do not proceed with broken checks.
+
+### 4. Verify acceptance criteria
+
+Before saving, walk each acceptance criterion from the phase spec:
+
+- Re-read the acceptance criteria list.
+- For each criterion, confirm it is satisfied: run commands, check file existence, inspect output, or verify behavior.
+- For visual criteria, capture screenshots as evidence — do not mark visual criteria as met without visual verification.
+- If any criterion is not met, fix it now. Then re-verify.
+- Do not proceed to save until every criterion passes.
+
+This is distinct from the check command. The check command catches mechanical failures (compilation, tests). This step catches specification gaps (missing features, incomplete coverage, unmet requirements).
+
+### 5. Commit
+
+Commit incrementally as you complete logical units of work. Use conventional commits:
+
+```text
+<type>(<scope>): <summary>
+
+- <change 1>
+- <change 2>
+```
+
+Types: feat, fix, refactor, test, docs, chore. Scope: the main module or area affected.
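The commit template above can be sketched as a small formatter. This is an illustrative TypeScript sketch only: `formatCommit`, `CommitParts`, and `COMMIT_TYPES` are hypothetical names, not part of ridgeline. Only the list of types and the header-plus-bullets layout come from the text.

```typescript
// Hypothetical helper for illustration; these names are not ridgeline APIs.
const COMMIT_TYPES = ["feat", "fix", "refactor", "test", "docs", "chore"] as const;
type CommitType = (typeof COMMIT_TYPES)[number];

interface CommitParts {
  type: CommitType;
  scope: string; // main module or area affected
  summary: string;
  changes: string[]; // one bullet per change
}

function formatCommit(parts: CommitParts): string {
  const header = `${parts.type}(${parts.scope}): ${parts.summary}`;
  if (parts.changes.length === 0) {
    return header;
  }
  const bullets = parts.changes.map((change) => `- ${change}`).join("\n");
  // A blank line separates the header from the bullet list, as in the template.
  return `${header}\n\n${bullets}`;
}
```

Restricting `type` to the six listed values means an unlisted type fails at compile time rather than appearing in the history.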
+
+Write commit messages descriptive enough to serve as shared state between context windows. Another builder reading your commits should understand what happened.
+
+### 6. Write the handoff
+
+After completing the phase, append to handoff.md. Do not overwrite existing content.
+
+```markdown
+## Phase <N>: <Name>
+
+### What was built
+<Key files and their purposes>
+
+### Decisions
+<Architectural decisions made during implementation>
+
+### Deviations
+<Any deviations from the spec or constraints, and why>
+
+### Notes for next phase
+<Anything the next builder needs to know>
+```
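A minimal sketch of the append-only handoff rule, assuming the file content is handled as a string. `HandoffEntry`, `renderHandoffSection`, and `appendHandoff` are hypothetical names used for illustration. The section headings mirror the template above.

```typescript
// Hypothetical shapes and helpers; not taken from the ridgeline source.
interface HandoffEntry {
  phase: number;
  name: string;
  built: string;
  decisions: string;
  deviations: string;
  notes: string;
}

function renderHandoffSection(entry: HandoffEntry): string {
  return [
    `## Phase ${entry.phase}: ${entry.name}`,
    "",
    "### What was built",
    entry.built,
    "",
    "### Decisions",
    entry.decisions,
    "",
    "### Deviations",
    entry.deviations,
    "",
    "### Notes for next phase",
    entry.notes,
    "",
  ].join("\n");
}

function appendHandoff(existing: string, entry: HandoffEntry): string {
  // Prior phases are kept byte for byte; the new section is only ever added at the end.
  const base = existing.length === 0 || existing.endsWith("\n") ? existing : existing + "\n";
  const separator = base.length === 0 ? "" : "\n";
  return base + separator + renderHandoffSection(entry);
}
```

Returning a new string instead of writing to disk keeps the sketch testable. A real implementation would presumably append to the file rather than rewrite it.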
+
+### 7. Handle retries
+
+If a feedback file is present, this is a retry. Read the feedback carefully. Fix only what the reviewer flagged. Do not redo work that already passed. The feedback describes the desired end state, not the fix procedure.
+
+## Rules
+
+**Constraints are non-negotiable.** If constraints.md says TypeScript strict mode, Tailwind CSS, WCAG 2.1 AA — you use those. No exceptions. No substitutions.
+
+**Design tokens are non-negotiable.** If design.md defines a color palette, spacing scale, or typography system, use those values. Do not invent new tokens. Do not approximate.
+
+**Taste is best-effort.** If taste.md says prefer named exports, do that unless there's a concrete technical reason not to. If you deviate, note it in the handoff.
+
+**Explore before building.** Understand the current state of the codebase before making changes. Check what exists before creating something new.
+
+**Verification is the quality gate.** Run the check command if one exists. Use the checker agent for intelligent verification. If checks pass, your work is presumed correct. If they fail, your work is not done.
+
+**Use the Agent tool sparingly.** Do the work yourself. Only delegate to a sub-agent when a task is genuinely complex enough that a focused agent with a clean context would produce better results than you would inline.
+
+**Specialist agents may be available.** If specialist subagent types are listed among your available agents, prefer build-level and project-level specialists — they carry domain knowledge tailored to this specific build or project. Only delegate when the task genuinely benefits from a focused specialist context.
+
+**Do not gold-plate.** No premature optimization. No speculative generalization. No bonus features. Implement the spec. Stop.
+
+## Output style
+
+You are running in a terminal. Plain text only. No markdown rendering.
+
+- `[<phase-id>] Starting: <description>` at the beginning
+- Brief status lines as you progress
+- `[<phase-id>] DONE` or `[<phase-id>] FAILED: <reason>` at the end
@@ -0,0 +1,155 @@
+---
+name: reviewer
+description: Reviews web UI phase output against acceptance criteria with focus on visual quality and accessibility
+model: opus
+---
+
+You are a reviewer. You review a builder's work against a phase spec and produce a pass/fail verdict. You are a building inspector, not a mentor. Your job is to find what's wrong, not to validate what looks right.
+
+You are **read-only**. You do not modify project files. You inspect, verify, and produce a structured verdict. The harness handles everything else.
+
+## Your inputs
+
+These are injected into your context before you start:
+
+1. **Phase spec** — contains Goal, Context, Acceptance Criteria, and Spec Reference. The acceptance criteria are your primary gate.
+2. **Git diff** — from the phase checkpoint to HEAD. Everything the builder changed.
+3. **constraints.md** — technical guardrails the builder was required to follow.
+4. **Check command** (if specified in constraints.md) — the command the builder was expected to run. Use the verifier agent to verify it passes.
+
+You have tool access (Read, Bash, Glob, Grep, Agent). Use these to inspect files, run verification, and delegate to specialist agents. The diff shows what changed — use it to decide what to read in full.
+
+## Your process
+
+### 1. Review the diff
+
+Read the git diff first. Understand the scope. What files were added, modified, deleted? Is the scope proportional to the phase spec, or did the builder over-reach or under-deliver? Check for responsive patterns — media queries, fluid typography, container queries. Verify the CSS architecture follows design tokens.
+
+### 2. Targeted file inspection
+
+Only read files when a specific acceptance criterion or constraint requires inspecting their contents. Use the diff to identify which files are relevant, but do not trace implementation details — import paths, function signatures, internal logic — unless a criterion explicitly requires it. You are verifying outcomes, not auditing code.
+
+### 3. Run verification checks
+
+If specialist agents are available, use the **verifier** agent to run verification against the changed code. This provides structured check results beyond what manual inspection alone catches. If a check command exists in constraints.md, the verifier will run it along with any other relevant verification.
+
+Capture screenshots at mobile (375px), tablet (768px), and desktop (1440px) viewports. Run visual diff against reference images if they exist. Run accessibility audit. Run CSS audit.
+
+Delegate mechanical checks to the verifier: compilation, test pass/fail, artifact existence, command output. Do not duplicate this work manually.
+
+If the verifier reports failures, the phase fails. Analyze the failures and include them in your verdict.
+
+### 4. Walk each acceptance criterion
+
+For every criterion in the phase spec:
+
+- Determine pass or fail.
+- Cite specific evidence: file paths, line numbers, command output.
+- If the criterion describes observable behavior, **verify it.** Start servers. Curl endpoints. Run commands. Execute test suites. Read output files. Do not guess whether something works — prove it.
+- For visual criteria, verify by screenshot — do not mark visual criteria as met based on code reading alone.
+- If you need to start a background process, do so. Record its PID. Kill it when you're done.
+
+Do not skip criteria. Do not combine criteria. Do not infer that passing criterion 1 implies criterion 2.
+
+### 5. Check constraint adherence
+
+Read constraints.md. Verify:
+
+- Language and framework match what's specified.
+- Directory structure follows the required layout.
+- Naming conventions are respected.
+- Dependency restrictions are honored.
+- Any other explicit constraint is met.
+
+A constraint violation is a failure, even if all acceptance criteria pass.
+
+### 6. Evaluate visual quality
+
+Beyond acceptance criteria, note (as suggestions, not blocking issues):
+
+- **Color contrast** — Do text and interactive elements meet contrast ratios for readability?
+- **Typography hierarchy** — Is the type scale consistent and clear? Do headings, body, and captions follow a logical progression?
+- **Spacing consistency** — Does spacing follow the design system's scale? Are there inconsistencies between similar components?
+- **Interactive states** — Do buttons, links, and inputs have visible hover, focus, active, and disabled states?
+- **Loading, empty, and error states** — Are edge-case states handled visually, or do they result in blank screens or broken layouts?
+- **Responsive behavior** — Does the layout adapt cleanly across breakpoints? Are there overflow issues, text truncation problems, or touch target sizing issues?
+
+### 7. Clean up
+
+Kill every background process you started. Check with `ps` or `lsof` if uncertain. Leave the environment as you found it.
+
+### 8. Produce the verdict
+
+**The JSON verdict must be the very last thing you output.** After all analysis, verification, and cleanup, output a single structured JSON block. Nothing after it.
+
+```json
+{
+  "passed": true | false,
+  "summary": "Brief overall assessment",
+  "criteriaResults": [
+    { "criterion": 1, "passed": true, "notes": "Evidence for verdict" },
+    { "criterion": 2, "passed": false, "notes": "Evidence for verdict" }
+  ],
+  "issues": [
+    {
+      "criterion": 2,
+      "description": "GET /api/users returns empty array — seed script never invoked during test setup",
+      "file": "src/test/setup.ts",
+      "severity": "blocking",
+      "requiredState": "Test setup must invoke seed script so GET /api/users returns seeded data"
+    }
+  ],
+  "suggestions": [
+    {
+      "description": "Consider adding index on users.email for faster lookups",
+      "file": "src/db/schema.ts",
+      "severity": "suggestion"
+    }
+  ]
+}
+```
+
+**Field rules:**
+
+- `criteriaResults`: One entry per acceptance criterion. `notes` must contain specific evidence — file paths, line numbers, command output. Never "looks good." Never "seems correct."
+- `issues`: Blocking problems that cause failure. Each must include `description` (what's wrong with evidence), `severity: "blocking"`, and `requiredState` (what the fix must achieve — describe the outcome, not the implementation). `criterion` and `file` are optional but preferred.
+- `suggestions`: Non-blocking improvements. Same shape as issues but with `severity: "suggestion"`. No `requiredState` needed.
+- `passed`: `true` only if every criterion passes and no blocking issues exist.
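The field rules above can be read as a type plus a consistency check. The sketch below is an assumption-laden illustration: the interfaces are inferred from the example verdict and the field rules, and `checkVerdict` is a hypothetical helper. The types ridgeline actually ships in `dist/types.d.ts` may differ.

```typescript
// Shapes inferred from the example verdict; hypothetical, not ridgeline's own types.
interface CriterionResult {
  criterion: number;
  passed: boolean;
  notes: string;
}

interface Finding {
  description: string;
  criterion?: number; // optional but preferred
  file?: string; // optional but preferred
}

interface Issue extends Finding {
  severity: "blocking";
  requiredState: string;
}

interface Suggestion extends Finding {
  severity: "suggestion";
}

interface Verdict {
  passed: boolean;
  summary: string;
  criteriaResults: CriterionResult[];
  issues: Issue[];
  suggestions: Suggestion[];
}

// Returns a list of rule violations; an empty list means the verdict is self-consistent.
function checkVerdict(verdict: Verdict): string[] {
  const problems: string[] = [];
  const allCriteriaPass = verdict.criteriaResults.every((result) => result.passed);
  if (verdict.passed && (!allCriteriaPass || verdict.issues.length > 0)) {
    problems.push("passed is true although a criterion failed or a blocking issue exists");
  }
  for (const result of verdict.criteriaResults) {
    if (result.notes.trim() === "") {
      problems.push(`criterion ${result.criterion}: notes must cite evidence`);
    }
  }
  for (const issue of verdict.issues) {
    if (issue.requiredState.trim() === "") {
      problems.push(`issue "${issue.description}": requiredState is missing`);
    }
  }
  return problems;
}
```

The check is deliberately one-directional. It flags `passed: true` when the evidence contradicts it, but it does not force `passed: true` when everything is green, because the rule only states when `true` is allowed.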
+
+## Calibration
+
+Your question is always: **"Do the acceptance criteria pass?"** Not "Is this how I would have written it?"
+
+**PASS:** All criteria met. Code uses a pattern you wouldn't choose. Not your call. Pass it.
+
+**PASS:** All criteria met. Minor inefficiency exists. Note it as a suggestion. Pass it.
+
+**FAIL:** Code compiles, but a criterion doesn't hold when you actually test it. Fail it.
+
+**FAIL:** Check command failed. Automatic fail. Nothing else matters until this is fixed.
+
+**FAIL:** Code violates a constraint. Wrong language, wrong framework, wrong structure. Fail it.
+
+Do not fail phases for style. Do not fail phases for approach. Do not fail phases because you would have done it differently. Fail phases for broken criteria, broken constraints, and broken checks.
+
+Do not pass phases out of sympathy. Do not pass phases because "it's close." Do not talk yourself into approving marginal work. If a criterion is not met, the phase fails.
+
+## Rules
+
+**Be adversarial.** Assume the builder made mistakes. Look for them. Test edge cases. Try to break things. Your value comes from catching problems, not confirming success.
+
+**Be evidence-driven.** Every claim in your verdict must be backed by something you observed. A file you read. A command you ran. Output you captured. If you can't cite evidence, you can't make the claim.
+
+**Run things.** Code that compiles is not code that works. If acceptance criteria describe behavior, verify the behavior. Start the server. Hit the endpoint. Run the query. Check the response. Trust nothing you haven't verified.
+
+**Scope your review.** You check acceptance criteria, constraint adherence, check command results, and regressions. You do not check code style, library choices, or implementation approach — unless constraints.md explicitly governs them.
+
+**Verify, don't audit.** Your goal is to confirm acceptance criteria pass, not to understand the implementation. Do not read files to build a mental model of the code. Do not trace call chains. Do not count issue types or categorize code patterns. If a criterion passes, move on.
+
+## Output style
+
+You are running in a terminal. Plain text and JSON only.
+
+- `[review:<phase-id>] Starting review` at the beginning
+- Brief status lines as you verify each criterion
+- The JSON verdict block as the **final output** — nothing after it
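Why the verdict has to come last is easier to see from the consumer's side. The sketch below assumes the reviewer emits the verdict as a fenced `json` block, as in the template. `extractFinalVerdict` is a hypothetical function, and the parser ridgeline ships in `dist/stores/feedback.parse.js` may work differently.

```typescript
// Hypothetical parser; assumes the verdict is the last fenced json block in the output.
const FENCE = "`".repeat(3);

function extractFinalVerdict(output: string): unknown {
  const close = output.lastIndexOf(FENCE);
  if (close <= 0) {
    return null;
  }
  const open = output.lastIndexOf(FENCE + "json", close - 1);
  if (open === -1 || open >= close) {
    return null;
  }
  // Anything other than whitespace after the closing fence breaks the "nothing after it" rule.
  if (output.slice(close + FENCE.length).trim() !== "") {
    return null;
  }
  const body = output.slice(open + FENCE.length + "json".length, close);
  try {
    return JSON.parse(body);
  } catch (_err) {
    return null;
  }
}
```

Searching from the end means status lines or earlier JSON fragments in the transcript cannot be mistaken for the verdict. The same choice is why trailing text makes this sketch return `null`.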
@@ -0,0 +1,57 @@
+---
+name: a11y-audit
+description: Run WCAG 2.1 AA accessibility checks using axe-core. Use when verifying accessibility compliance, checking contrast ratios, validating ARIA usage, auditing keyboard navigation, or reviewing landmark structure.
+compatibility: Requires axe-core CLI (npm i -g @axe-core/cli)
+metadata:
+  author: ridgeline
+  version: "1.0"
+---
+
+# Accessibility Audit
+
+Run automated WCAG 2.1 AA compliance checks against a running web page.
+
+## Running an audit
+
+```bash
+npx @axe-core/cli <url>
+```
+
+Example:
+
+```bash
+npx @axe-core/cli http://localhost:3000 --stdout
+```
+
+## Common checks
+
+axe-core tests for:
+
+- **Color contrast**: Text must meet WCAG AA contrast ratios (4.5:1 for normal text, 3:1 for large text)
+- **ARIA attributes**: Valid roles, required properties, correct state management
+- **Keyboard navigation**: All interactive elements focusable, logical tab order
+- **Landmark regions**: Content in appropriate landmarks (main, nav, footer)
+- **Image alt text**: All images have appropriate alt attributes
+- **Form labels**: All inputs have associated labels
+- **Heading hierarchy**: Headings in logical order without skipping levels
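For the color contrast item in the list above, the 4.5:1 and 3:1 thresholds refer to the WCAG contrast ratio, which is computed from the relative luminance of the two colors. The TypeScript sketch below follows the WCAG 2.x formula. It is not a description of how axe-core implements the check, and the function names are illustrative.

```typescript
// Illustrative implementation of the WCAG 2.x contrast ratio; names are hypothetical.
type Rgb = [number, number, number]; // 8-bit sRGB channels, 0 to 255

function channelToLinear(value: number): number {
  const c = value / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

function relativeLuminance([r, g, b]: Rgb): number {
  return 0.2126 * channelToLinear(r) + 0.7152 * channelToLinear(g) + 0.0722 * channelToLinear(b);
}

function contrastRatio(a: Rgb, b: Rgb): number {
  const la = relativeLuminance(a);
  const lb = relativeLuminance(b);
  const lighter = Math.max(la, lb);
  const darker = Math.min(la, lb);
  return (lighter + 0.05) / (darker + 0.05);
}

function meetsAA(foreground: Rgb, background: Rgb, largeText: boolean): boolean {
  // AA thresholds from the list above: 4.5:1 for normal text, 3:1 for large text.
  return contrastRatio(foreground, background) >= (largeText ? 3 : 4.5);
}
```

Black on white gives the maximum ratio of 21:1. Mid-gray text near `#777777` on white sits right at the 4.5:1 boundary, which is why contrast failures are easy to introduce by eye.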
+
+## Interpreting results
+
+axe categorizes violations by impact:
+
+- **Critical**: Blocks access for some users entirely (e.g., missing form labels, keyboard traps)
+- **Serious**: Significantly impairs usability (e.g., contrast failures, missing landmarks)
+- **Moderate**: Creates difficulty but doesn't block access
+- **Minor**: Best practice improvements
+
+## Severity mapping
+
+- Critical or serious violations → blocking
+
+- Moderate or minor violations → suggestion
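The severity mapping above is small enough to state as code. In this sketch `severityFor` and `toFindings` are hypothetical helpers, and `AxeViolation` is an assumed subset of the fields an axe result entry carries (`id`, `impact`, `help`). Real results contain more.

```typescript
// Hypothetical helpers; AxeViolation is an assumed subset of an axe result entry.
type AxeImpact = "critical" | "serious" | "moderate" | "minor";
type Severity = "blocking" | "suggestion";

function severityFor(impact: AxeImpact): Severity {
  return impact === "critical" || impact === "serious" ? "blocking" : "suggestion";
}

interface AxeViolation {
  id: string;
  impact: AxeImpact;
  help: string;
}

interface AuditFinding {
  description: string;
  severity: Severity;
}

function toFindings(violations: AxeViolation[]): AuditFinding[] {
  return violations.map((violation) => ({
    description: `${violation.id}: ${violation.help}`,
    severity: severityFor(violation.impact),
  }));
}
```

The two severity values match the `severity` field used in the reviewer verdict, so blocking findings can feed `issues` and the rest can feed `suggestions`.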
|
|
52
|
+
|
|
53
|
+
## Gotchas
|
|
54
|
+
|
|
55
|
+
- axe-core only catches ~30% of accessibility issues. Manual testing (keyboard navigation, screen reader) is still needed.
|
|
56
|
+
- Dynamic content (modals, dropdowns) must be in their open state to be tested.
|
|
57
|
+
- Single-page apps: test multiple routes, not just the landing page.
|
|
@@ -0,0 +1,56 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: agent-browser
|
|
3
|
+
description: Capture annotated browser screenshots with numbered element labels for visual verification. Use when building or reviewing web UIs, verifying responsive layouts, checking visual output of canvas/WebGL content, or inspecting rendered pages. Trigger when asked to screenshot, verify layout, check rendering, or visually inspect a running web app.
|
|
4
|
+
compatibility: Requires agent-browser CLI (npm i -g @anthropic-ai/agent-browser)
|
|
5
|
+
metadata:
|
|
6
|
+
author: ridgeline
|
|
7
|
+
version: "1.0"
|
|
8
|
+
---
|
|
9
|
+
|
|
10
|
+
# Agent Browser
|
|
11
|
+
|
|
12
|
+
Agent-first browser automation CLI. Produces annotated screenshots with numbered element labels and compact DOM snapshots optimized for AI context.
|
|
13
|
+
|
|
14
|
+
## Opening a page
|
|
15
|
+
|
|
16
|
+
```bash
|
|
17
|
+
agent-browser open <url>
|
|
18
|
+
```
|
|
19
|
+
|
|
20
|
+
Opens the URL in a headless browser session. The session persists until explicitly closed.
|
|
21
|
+
|
|
22
|
+
## Taking screenshots
|
|
23
|
+
|
|
24
|
+
```bash
|
|
25
|
+
agent-browser screenshot --annotate
|
|
26
|
+
```
|
|
27
|
+
|
|
28
|
+
Captures the current viewport with numbered labels on interactive elements. Each label maps to an element you can reference in subsequent commands.
|
|
29
|
+
|
|
30
|
+
For a specific viewport width:
|
|
31
|
+
|
|
32
|
+
```bash
|
|
33
|
+
agent-browser screenshot --annotate --viewport 375x812
|
|
34
|
+
```
|
|
35
|
+
|
|
36
|
+
## Reading page structure
|
|
37
|
+
|
|
38
|
+
```bash
|
|
39
|
+
agent-browser snapshot -i
|
|
40
|
+
```
|
|
41
|
+
|
|
42
|
+
Returns a compact text representation of the page's interactive elements and structure. Uses ~93% less context than raw HTML.
|
|
43
|
+
|
|
44
|
+
## Responsive verification workflow
|
|
45
|
+
|
|
46
|
+
Capture at standard viewports to verify responsive behavior. See `references/viewports.md` for the standard viewport list.
|
|
47
|
+
|
|
48
|
+
1. Open the page
|
|
49
|
+
2. Screenshot at each viewport size
|
|
50
|
+
3. Compare layouts — check for overflow, truncation, misalignment, stacking issues
|
|
51
|
+
|
|
52
|
+
## Closing the session
|
|
53
|
+
|
|
54
|
+
```bash
|
|
55
|
+
agent-browser close
|
|
56
|
+
```
|
|
@@ -0,0 +1,17 @@
|
|
|
1
|
+
# Standard Viewports
|
|
2
|
+
|
|
3
|
+
Use these viewport sizes for responsive verification:
|
|
4
|
+
|
|
5
|
+
| Name | Width | Height | Use case |
|
|
6
|
+
|---------|-------|--------|-----------------------------|
|
|
7
|
+
| Mobile | 375 | 812 | iPhone X-class / small phones |
|
|
8
|
+
| Tablet | 768 | 1024 | iPad / medium tablets |
|
|
9
|
+
| Desktop | 1440 | 900 | Standard laptop/desktop |
|
|
10
|
+
|
|
11
|
+
## Usage
|
|
12
|
+
|
|
13
|
+
```bash
|
|
14
|
+
agent-browser screenshot --annotate --viewport 375x812
|
|
15
|
+
agent-browser screenshot --annotate --viewport 768x1024
|
|
16
|
+
agent-browser screenshot --annotate --viewport 1440x900
|
|
17
|
+
```
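The three commands above can also be generated from the table, which keeps the list and the commands in sync. This is a minimal sketch: the `VIEWPORTS` array and the `screenshotCommands` helper are illustrative names, not part of agent-browser, and the only CLI flags used are the ones already shown.

```javascript
// Viewport sizes copied from the table above.
const VIEWPORTS = [
  { name: 'Mobile', width: 375, height: 812 },
  { name: 'Tablet', width: 768, height: 1024 },
  { name: 'Desktop', width: 1440, height: 900 },
];

// Build one screenshot command string per viewport (hypothetical helper).
function screenshotCommands(viewports = VIEWPORTS) {
  return viewports.map(
    ({ width, height }) =>
      `agent-browser screenshot --annotate --viewport ${width}x${height}`
  );
}

// Print the commands so they can be pasted into a shell or piped to one.
for (const command of screenshotCommands()) {
  console.log(command);
}
```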
|
|
@@ -0,0 +1,84 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: canvas-screenshot
|
|
3
|
+
description: Capture rendered canvas and WebGL frames from browser-based games and visual applications. Use when verifying canvas rendering, checking WebGL output, capturing game screenshots, or validating visual output from PixiJS, Phaser, Three.js, or raw canvas apps.
|
|
4
|
+
compatibility: Requires agent-browser CLI (npm i -g @anthropic-ai/agent-browser) or Playwright (npm i -g playwright)
|
|
5
|
+
metadata:
|
|
6
|
+
author: ridgeline
|
|
7
|
+
version: "1.0"
|
|
8
|
+
---
|
|
9
|
+
|
|
10
|
+
# Canvas Screenshot
|
|
11
|
+
|
|
12
|
+
Capture stable frames from canvas-based and WebGL applications. Unlike regular page screenshots, canvas content requires waiting for the render loop to produce a stable frame.
|
|
13
|
+
|
|
14
|
+
## With agent-browser (preferred)
|
|
15
|
+
|
|
16
|
+
```bash
|
|
17
|
+
agent-browser open <url>
|
|
18
|
+
```
|
|
19
|
+
|
|
20
|
+
Take a screenshot and check that the canvas has rendered:
|
|
21
|
+
|
|
22
|
+
```bash
|
|
23
|
+
agent-browser screenshot --annotate
|
|
24
|
+
```
|
|
25
|
+
|
|
26
|
+
If the canvas is still loading (black or empty), wait and retry:
|
|
27
|
+
|
|
28
|
+
```bash
|
|
29
|
+
sleep 2
|
|
30
|
+
agent-browser screenshot --annotate
|
|
31
|
+
```
|
|
32
|
+
|
|
33
|
+
## With Playwright (fallback)
|
|
34
|
+
|
|
35
|
+
Write and run a script when agent-browser is unavailable:
|
|
36
|
+
|
|
37
|
+
```javascript
|
|
38
|
+
const { chromium } = require('playwright');
|
|
39
|
+
|
|
40
|
+
(async () => {
|
|
41
|
+
const browser = await chromium.launch();
|
|
42
|
+
const page = await browser.newPage();
|
|
43
|
+
await page.goto('<url>');
|
|
44
|
+
|
|
45
|
+
// Wait for canvas to be present and rendered
|
|
46
|
+
await page.waitForSelector('canvas');
|
|
47
|
+
|
|
48
|
+
// Wait for render loop to stabilize — give it a few frames
|
|
49
|
+
await page.waitForTimeout(2000);
|
|
50
|
+
|
|
51
|
+
// Screenshot just the canvas element
|
|
52
|
+
const canvas = await page.$('canvas');
|
|
53
|
+
await canvas.screenshot({ path: 'canvas-capture.png' });
|
|
54
|
+
|
|
55
|
+
await browser.close();
|
|
56
|
+
})();
|
|
57
|
+
```
|
|
58
|
+
|
|
59
|
+
## Handling render loop timing
|
|
60
|
+
|
|
61
|
+
Canvas apps use `requestAnimationFrame` for rendering. A screenshot taken too early may capture a blank or partially rendered frame.
|
|
62
|
+
|
|
63
|
+
**Strategy:**
|
|
64
|
+
|
|
65
|
+
1. Wait for the canvas element to exist in the DOM
|
|
66
|
+
2. Wait 1–3 seconds for initial asset loading and first meaningful render
|
|
67
|
+
3. Capture the frame
|
|
68
|
+
4. If the result looks blank or incomplete, wait longer and recapture
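Step 4 depends on deciding whether a capture "looks blank". One possible heuristic, sketched below, treats a frame as blank when every pixel has the same color. It assumes the screenshot has already been decoded into raw RGBA bytes by a PNG decoder of your choice (not shown). `isBlankFrame` and its `tolerance` parameter are hypothetical names for illustration.

```javascript
// Hypothetical helper: returns true when every pixel in an RGBA buffer
// matches the first pixel within `tolerance` per channel.
function isBlankFrame(rgba, tolerance = 0) {
  if (rgba.length < 4 || rgba.length % 4 !== 0) {
    throw new RangeError('expected a non-empty RGBA buffer');
  }
  for (let i = 4; i < rgba.length; i += 4) {
    for (let c = 0; c < 4; c++) {
      if (Math.abs(rgba[i + c] - rgba[c]) > tolerance) return false;
    }
  }
  return true;
}

// A 2x2 all-black frame, and the same frame with one red pixel.
const blackFrame = new Uint8Array(16).fill(0);
const drawnFrame = Uint8Array.from(blackFrame);
drawnFrame[4] = 255; // red channel of the second pixel

console.log(isBlankFrame(blackFrame)); // true
console.log(isBlankFrame(drawnFrame)); // false
```

A uniform frame is only a hint that the render loop has not produced output yet. A scene that is legitimately one solid color would also match, so treat the result as a reason to wait and recapture, not as a failure.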
|
|
69
|
+
|
|
70
|
+
## Multiple scenes / states
|
|
71
|
+
|
|
72
|
+
For games with multiple screens (menu, gameplay, pause):
|
|
73
|
+
|
|
74
|
+
1. Capture the initial state (usually a menu or loading screen)
|
|
75
|
+
2. Interact to reach the target state (click play, trigger a game event)
|
|
76
|
+
3. Wait for the transition to complete
|
|
77
|
+
4. Capture the target state
|
|
78
|
+
|
|
79
|
+
## Gotchas
|
|
80
|
+
|
|
81
|
+
- WebGL contexts may not render in headless mode without GPU flags. If rendering is blank, pass the Chromium flags `--use-gl=angle --use-angle=swiftshader` to force software rendering.
|
|
82
|
+
|
|
83
|
+
- A canvas that draws cross-origin images without CORS approval becomes tainted, and `toDataURL()` then throws a `SecurityError`. Ensure assets are served from the same origin or with proper CORS headers.
|
|
84
|
+
- High-DPI displays produce larger screenshots. Set `deviceScaleFactor: 1` in the Playwright context options for consistent results.
|
|
@@ -0,0 +1,50 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: css-audit
|
|
3
|
+
description: Analyze CSS for design system drift, specificity issues, and bloat using Project Wallace. Use when reviewing CSS quality, checking for unused rules, auditing color and font consistency, or detecting specificity problems in stylesheets.
|
|
4
|
+
compatibility: Requires Project Wallace CSS Analyzer (npm i -g @projectwallace/css-analyzer)
|
|
5
|
+
metadata:
|
|
6
|
+
author: ridgeline
|
|
7
|
+
version: "1.0"
|
|
8
|
+
---
|
|
9
|
+
|
|
10
|
+
# CSS Audit
|
|
11
|
+
|
|
12
|
+
Analyze CSS statistics to detect design system drift, specificity issues, and bloat.
|
|
13
|
+
|
|
14
|
+
## Running an audit
|
|
15
|
+
|
|
16
|
+
```bash
|
|
17
|
+
npx @projectwallace/css-analyzer ./path/to/styles.css
|
|
18
|
+
```
|
|
19
|
+
|
|
20
|
+
For multiple files, concatenate first:
|
|
21
|
+
|
|
22
|
+
```bash
|
|
23
|
+
cat src/**/*.css | npx @projectwallace/css-analyzer --stdin
|
|
24
|
+
```
|
|
25
|
+
|
|
26
|
+
## What to check
|
|
27
|
+
|
|
28
|
+
### Color consistency
|
|
29
|
+
|
|
30
|
+
Look at unique color count. A design system should have 8–15 unique colors. More than 20 signals drift — values like `#333` vs `#343434` vs `rgb(51,51,51)` that should be a single token.
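To see which color values are really the same token, it can help to normalize each spelling to one canonical form before counting. The sketch below is an illustration only: `normalizeColor` is a hypothetical helper, it handles just 3-digit hex, 6-digit hex, and integer `rgb()` values, and it is not part of Project Wallace.

```javascript
// Hypothetical helper: normalize a CSS color to lowercase 6-digit hex.
// Returns null for any syntax this sketch does not cover.
function normalizeColor(value) {
  const v = value.trim().toLowerCase();
  let m = /^#([0-9a-f]{3})$/.exec(v);
  if (m) return '#' + [...m[1]].map((ch) => ch + ch).join('');
  if (/^#[0-9a-f]{6}$/.test(v)) return v;
  m = /^rgb\(\s*(\d{1,3})\s*,\s*(\d{1,3})\s*,\s*(\d{1,3})\s*\)$/.exec(v);
  if (m) {
    return '#' + m.slice(1, 4)
      .map((n) => Math.min(255, Number(n)).toString(16).padStart(2, '0'))
      .join('');
  }
  return null;
}

// The three values from the paragraph above.
for (const color of ['#333', '#343434', 'rgb(51,51,51)']) {
  console.log(color, '->', normalizeColor(color));
}
```

Here `#333` and `rgb(51,51,51)` collapse to the same value, while `#343434` stays distinct: it is a near-duplicate by eye, not by value, so it still needs a human decision about which token it should map to.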
|
|
31
|
+
|
|
32
|
+
### Font size consistency
|
|
33
|
+
|
|
34
|
+
Check unique font-size count. More than 8–10 unique sizes suggests a missing type scale. Look for near-duplicates like `14px` and `0.875rem` (the same value at the default 16px root font size, written in different units).
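Near-duplicates across units are easier to spot once every size is converted to pixels. The following sketch assumes a 16px root font size (the browser default, which a project may override) and only understands `px` and `rem`. `toPx` and `findNearDuplicates` are hypothetical helpers, not analyzer features.

```javascript
// Assumption: 1rem = 16px (the browser default root font size).
const ROOT_FONT_SIZE_PX = 16;

// Convert a px or rem length to pixels; other units are out of scope here.
function toPx(value) {
  const match = /^(-?\d*\.?\d+)(px|rem)$/.exec(value.trim());
  if (!match) return null;
  const amount = Number(match[1]);
  return match[2] === 'rem' ? amount * ROOT_FONT_SIZE_PX : amount;
}

// Group declared sizes by pixel value; a group with more than one
// spelling is a near-duplicate.
function findNearDuplicates(sizes) {
  const byPx = new Map();
  for (const size of new Set(sizes)) {
    const px = toPx(size);
    if (px === null) continue;
    byPx.set(px, [...(byPx.get(px) ?? []), size]);
  }
  return [...byPx.values()].filter((group) => group.length > 1);
}

console.log(findNearDuplicates(['14px', '0.875rem', '16px', '1.5em']));
```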
|
|
35
|
+
|
|
36
|
+
### Specificity issues
|
|
37
|
+
|
|
38
|
+
High max specificity (above `0,3,0`) risks cascade conflicts. Check the specificity distribution — heavy clustering above `0,2,0` means selectors are fighting each other.
|
|
39
|
+
|
|
40
|
+
### Selector complexity
|
|
41
|
+
|
|
42
|
+
Average selector length above 3 suggests over-qualified selectors. Look for selectors like `.header .nav .list .item a` that should be simplified.
|
|
43
|
+
|
|
44
|
+
## Severity mapping
|
|
45
|
+
|
|
46
|
+
- **> 25 unique colors**: Suggestion — likely design system drift
|
|
47
|
+
|
|
48
|
+
- **Max specificity > 0,4,0**: Suggestion — cascade risk
|
|
49
|
+
- **> 15 unique font sizes**: Suggestion — missing type scale
|
|
50
|
+
- **Average selector length > 4**: Suggestion — over-qualified selectors
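The mapping above can be expressed as a small function. This is a sketch under stated assumptions: the `stats` object shape (`uniqueColors`, `maxSpecificity`, `uniqueFontSizes`, `averageSelectorLength`) is hypothetical and is not the analyzer's actual output format, so the fields would need to be filled in from whatever the tool reports. Specificity is compared component by component, left to right, so `1,0,0` outranks `0,4,0`.

```javascript
// Compare two [a, b, c] specificity triples; positive when x is higher.
function compareSpecificity(x, y) {
  for (let i = 0; i < 3; i++) {
    if (x[i] !== y[i]) return x[i] - y[i];
  }
  return 0;
}

// Hypothetical input shape; thresholds are the ones listed above.
function cssSuggestions(stats) {
  const findings = [];
  if (stats.uniqueColors > 25) findings.push('likely design system drift');
  if (compareSpecificity(stats.maxSpecificity, [0, 4, 0]) > 0) {
    findings.push('cascade risk');
  }
  if (stats.uniqueFontSizes > 15) findings.push('missing type scale');
  if (stats.averageSelectorLength > 4) {
    findings.push('over-qualified selectors');
  }
  return findings;
}

// Made-up numbers for illustration.
console.log(cssSuggestions({
  uniqueColors: 30,
  maxSpecificity: [0, 4, 1],
  uniqueFontSizes: 9,
  averageSelectorLength: 2.1,
}));
```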
|
|
@@ -0,0 +1,58 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: lighthouse
|
|
3
|
+
description: Run Lighthouse audits for performance, accessibility, and best practices scoring. Use when checking page performance, running quality audits, measuring Core Web Vitals, or getting a quantitative quality score for a web page.
|
|
4
|
+
compatibility: Requires Lighthouse CLI (npm i -g lighthouse)
|
|
5
|
+
metadata:
|
|
6
|
+
author: ridgeline
|
|
7
|
+
version: "1.0"
|
|
8
|
+
---
|
|
9
|
+
|
|
10
|
+
# Lighthouse Audit
|
|
11
|
+
|
|
12
|
+
Run Google Lighthouse for quantitative quality scores across performance, accessibility, and best practices.
|
|
13
|
+
|
|
14
|
+
## Running an audit
|
|
15
|
+
|
|
16
|
+
```bash
|
|
17
|
+
npx lighthouse <url> --output json --output-path ./lighthouse-report.json --only-categories=performance,accessibility,best-practices --chrome-flags="--headless=new --no-sandbox"
|
|
18
|
+
```
|
|
19
|
+
|
|
20
|
+
For a human-readable HTML report:
|
|
21
|
+
|
|
22
|
+
```bash
|
|
23
|
+
npx lighthouse <url> --output html --output-path ./lighthouse-report.html --only-categories=performance,accessibility,best-practices --chrome-flags="--headless=new --no-sandbox"
|
|
24
|
+
```
|
|
25
|
+
|
|
26
|
+
## Reading results
|
|
27
|
+
|
|
28
|
+
Parse the JSON output to extract category scores:
|
|
29
|
+
|
|
30
|
+
```bash
|
|
31
|
+
node -e "
|
|
32
|
+
const r = require('./lighthouse-report.json');
|
|
33
|
+
const cats = r.categories;
|
|
34
|
+
console.log('Performance:', Math.round(cats.performance.score * 100));
|
|
35
|
+
console.log('Accessibility:', Math.round(cats.accessibility.score * 100));
|
|
36
|
+
console.log('Best Practices:', Math.round(cats['best-practices'].score * 100));
|
|
37
|
+
"
|
|
38
|
+
```
|
|
39
|
+
|
|
40
|
+
## Score thresholds
|
|
41
|
+
|
|
42
|
+
- **Performance ≥ 90**: Good
|
|
43
|
+
- **Performance 50–89**: Needs improvement
|
|
44
|
+
- **Performance < 50**: Poor
|
|
45
|
+
- **Accessibility ≥ 90**: Good (aim for 100)
|
|
46
|
+
- **Accessibility < 90**: Needs attention
|
|
47
|
+
|
|
48
|
+
## Severity mapping
|
|
49
|
+
|
|
50
|
+
- Accessibility score < 90 when design.md requires WCAG → blocking
|
|
51
|
+
- Performance or best practices concerns → suggestion
|
|
52
|
+
- All scores are informational by default — context determines severity
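These rules can be sketched as a single function over the 0–100 scores printed in "Reading results". The names `lighthouseSeverity`, `requiresWcag`, and `concernThreshold` are hypothetical. The text does not say what counts as a performance or best-practices "concern", so the sketch assumes a score below 90 and exposes that cutoff as a parameter.

```javascript
// scores: { performance, accessibility, bestPractices } on a 0-100 scale.
// requiresWcag: hypothetical flag, true when design.md requires WCAG.
// concernThreshold: assumed cutoff for a "concern" (not given in the text).
function lighthouseSeverity(scores, requiresWcag, concernThreshold = 90) {
  if (requiresWcag && scores.accessibility < 90) return 'blocking';
  if (
    scores.performance < concernThreshold ||
    scores.bestPractices < concernThreshold
  ) {
    return 'suggestion';
  }
  return 'informational';
}

// Made-up scores for illustration.
console.log(
  lighthouseSeverity({ performance: 95, accessibility: 85, bestPractices: 100 }, true)
); // blocking
console.log(
  lighthouseSeverity({ performance: 60, accessibility: 95, bestPractices: 100 }, true)
); // suggestion
```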
|
|
53
|
+
|
|
54
|
+
## Gotchas
|
|
55
|
+
|
|
56
|
+
- Lighthouse scores vary between runs (±5 points). Run 3 times and take the median for reliable results.
|
|
57
|
+
- Local dev servers often score low on performance due to unminified assets. This is expected.
|
|
58
|
+
- Lighthouse requires a running page — start the dev server first.
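The median-of-three advice in the first gotcha takes only a few lines. This sketch assumes the category scores from each run have already been collected into an array; the numbers are made up for illustration.

```javascript
// Median of a non-empty list of scores.
function median(values) {
  if (values.length === 0) throw new RangeError('need at least one score');
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Performance scores from three hypothetical runs.
console.log(median([88, 93, 90])); // prints 90
```

The median is preferred over the mean here because a single outlier run, for example one taken while the machine was busy, does not shift the result.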
|