npm - aw-ecc - Versions diffs - 1.4.32 → 1.4.48 - Mend

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (259)

package/.claude-plugin/plugin.json +1 -1
package/.cursor/INSTALL.md +7 -5
package/.cursor/hooks/adapter.js +41 -4
package/.cursor/hooks/after-agent-response.js +62 -0
package/.cursor/hooks/before-submit-prompt.js +7 -1
package/.cursor/hooks/post-tool-use-failure.js +21 -0
package/.cursor/hooks/post-tool-use.js +39 -0
package/.cursor/hooks/shared/aw-phase-definitions.js +53 -0
package/.cursor/hooks/shared/aw-phase-runner.js +3 -1
package/.cursor/hooks/subagent-start.js +22 -4
package/.cursor/hooks/subagent-stop.js +18 -1
package/.cursor/hooks.json +23 -2
package/.opencode/package.json +1 -1
package/AGENTS.md +3 -3
package/README.md +5 -5
package/commands/adk.md +52 -0
package/commands/build.md +22 -9
package/commands/deploy.md +12 -0
package/commands/execute.md +9 -0
package/commands/feature.md +333 -0
package/commands/investigate.md +18 -5
package/commands/plan.md +23 -9
package/commands/publish.md +65 -0
package/commands/review.md +12 -0
package/commands/ship.md +12 -0
package/commands/test.md +12 -0
package/commands/verify.md +9 -0
package/hooks/hooks.json +36 -0
package/manifests/install-components.json +8 -0
package/manifests/install-modules.json +83 -0
package/manifests/install-profiles.json +7 -0
package/package.json +2 -2
package/scripts/ci/validate-rules.js +51 -0
package/scripts/cursor-aw-home/hooks.json +23 -2
package/scripts/cursor-aw-hooks/adapter.js +41 -4
package/scripts/cursor-aw-hooks/before-submit-prompt.js +7 -1
package/scripts/hooks/aw-usage-commit-created.js +32 -0
package/scripts/hooks/aw-usage-post-tool-use-failure.js +56 -0
package/scripts/hooks/aw-usage-post-tool-use.js +242 -0
package/scripts/hooks/aw-usage-prompt-submit.js +112 -0
package/scripts/hooks/aw-usage-session-start.js +48 -0
package/scripts/hooks/aw-usage-stop.js +182 -0
package/scripts/hooks/aw-usage-telemetry-send.js +84 -0
package/scripts/hooks/cost-tracker.js +3 -23
package/scripts/hooks/shared/aw-phase-definitions.js +53 -0
package/scripts/hooks/shared/aw-phase-runner.js +3 -1
package/scripts/lib/aw-hook-contract.js +2 -2
package/scripts/lib/aw-pricing.js +306 -0
package/scripts/lib/aw-usage-telemetry.js +472 -0
package/scripts/lib/codex-hook-config.js +8 -8
package/scripts/lib/cursor-hook-config.js +25 -10
package/scripts/lib/install-targets/cursor-project.js +3 -0
package/scripts/lib/install-targets/helpers.js +20 -3
package/skills/aw-adk/SKILL.md +317 -0
package/skills/aw-adk/agents/analyzer.md +113 -0
package/skills/aw-adk/agents/comparator.md +113 -0
package/skills/aw-adk/agents/grader.md +115 -0
package/skills/aw-adk/assets/eval_review.html +76 -0
package/skills/aw-adk/eval-viewer/generate_review.py +164 -0
package/skills/aw-adk/eval-viewer/viewer.html +181 -0
package/skills/aw-adk/evals/eval-colocated-placement.md +84 -0
package/skills/aw-adk/evals/eval-create-agent.md +90 -0
package/skills/aw-adk/evals/eval-create-command.md +98 -0
package/skills/aw-adk/evals/eval-create-eval.md +89 -0
package/skills/aw-adk/evals/eval-create-rule.md +99 -0
package/skills/aw-adk/evals/eval-create-skill.md +97 -0
package/skills/aw-adk/evals/eval-delete-agent.md +79 -0
package/skills/aw-adk/evals/eval-delete-command.md +89 -0
package/skills/aw-adk/evals/eval-delete-rule.md +86 -0
package/skills/aw-adk/evals/eval-delete-skill.md +90 -0
package/skills/aw-adk/evals/eval-meta-eval-coverage.md +78 -0
package/skills/aw-adk/evals/eval-meta-eval-determinism.md +81 -0
package/skills/aw-adk/evals/eval-meta-eval-false-pass.md +81 -0
package/skills/aw-adk/evals/eval-score-accuracy.md +95 -0
package/skills/aw-adk/evals/eval-type-redirect.md +68 -0
package/skills/aw-adk/evals/evals.json +96 -0
package/skills/aw-adk/references/artifact-wiring.md +162 -0
package/skills/aw-adk/references/cross-ide-mapping.md +71 -0
package/skills/aw-adk/references/eval-placement-guide.md +183 -0
package/skills/aw-adk/references/external-resources.md +75 -0
package/skills/aw-adk/references/getting-started.md +66 -0
package/skills/aw-adk/references/registry-structure.md +152 -0
package/skills/aw-adk/references/rubric-agent.md +36 -0
package/skills/aw-adk/references/rubric-command.md +36 -0
package/skills/aw-adk/references/rubric-eval.md +36 -0
package/skills/aw-adk/references/rubric-meta-eval.md +132 -0
package/skills/aw-adk/references/rubric-rule.md +36 -0
package/skills/aw-adk/references/rubric-skill.md +36 -0
package/skills/aw-adk/references/schemas.md +222 -0
package/skills/aw-adk/references/template-agent.md +251 -0
package/skills/aw-adk/references/template-command.md +279 -0
package/skills/aw-adk/references/template-eval.md +176 -0
package/skills/aw-adk/references/template-rule.md +119 -0
package/skills/aw-adk/references/template-skill.md +123 -0
package/skills/aw-adk/references/type-classifier.md +98 -0
package/skills/aw-adk/references/writing-good-agents.md +227 -0
package/skills/aw-adk/references/writing-good-commands.md +258 -0
package/skills/aw-adk/references/writing-good-evals.md +271 -0
package/skills/aw-adk/references/writing-good-rules.md +214 -0
package/skills/aw-adk/references/writing-good-skills.md +159 -0
package/skills/aw-adk/scripts/aggregate-benchmark.py +190 -0
package/skills/aw-adk/scripts/lint-artifact.sh +211 -0
package/skills/aw-adk/scripts/score-artifact.sh +179 -0
package/skills/aw-adk/scripts/trigger-eval.py +192 -0
package/skills/aw-build/SKILL.md +19 -2
package/skills/aw-deploy/SKILL.md +65 -3
package/skills/aw-design/SKILL.md +156 -0
package/skills/aw-design/references/highrise-tokens.md +394 -0
package/skills/aw-design/references/micro-interactions.md +76 -0
package/skills/aw-design/references/prompt-template.md +160 -0
package/skills/aw-design/references/quality-checklist.md +70 -0
package/skills/aw-design/references/self-review.md +497 -0
package/skills/aw-design/references/stitch-workflow.md +127 -0
package/skills/aw-feature/SKILL.md +293 -0
package/skills/aw-investigate/SKILL.md +17 -0
package/skills/aw-plan/SKILL.md +34 -3
package/skills/aw-publish/SKILL.md +300 -0
package/skills/aw-publish/evals/eval-confirmation-gate.md +60 -0
package/skills/aw-publish/evals/eval-intent-detection.md +111 -0
package/skills/aw-publish/evals/eval-push-modes.md +67 -0
package/skills/aw-publish/evals/eval-rules-push.md +60 -0
package/skills/aw-publish/evals/evals.json +29 -0
package/skills/aw-publish/references/push-modes.md +38 -0
package/skills/aw-review/SKILL.md +88 -9
package/skills/aw-rules-review/SKILL.md +124 -0
package/skills/aw-rules-review/agents/openai.yaml +3 -0
package/skills/aw-rules-review/scripts/generate-review-template.mjs +323 -0
package/skills/aw-ship/SKILL.md +16 -0
package/skills/aw-spec/SKILL.md +15 -0
package/skills/aw-tasks/SKILL.md +15 -0
package/skills/aw-test/SKILL.md +16 -0
package/skills/aw-yolo/SKILL.md +4 -0
package/skills/diagnose/SKILL.md +121 -0
package/skills/diagnose/scripts/hitl-loop.template.sh +41 -0
package/skills/finish-only-when-green/SKILL.md +265 -0
package/skills/grill-me/SKILL.md +24 -0
package/skills/grill-with-docs/SKILL.md +92 -0
package/skills/grill-with-docs/adr-format.md +47 -0
package/skills/grill-with-docs/context-format.md +67 -0
package/skills/improve-codebase-architecture/SKILL.md +75 -0
package/skills/improve-codebase-architecture/deepening.md +37 -0
package/skills/improve-codebase-architecture/interface-design.md +44 -0
package/skills/improve-codebase-architecture/language.md +53 -0
package/skills/local-ghl-setup-from-screenshot/SKILL.md +538 -0
package/skills/tdd/SKILL.md +115 -0
package/skills/tdd/deep-modules.md +33 -0
package/skills/tdd/interface-design.md +31 -0
package/skills/tdd/mocking.md +59 -0
package/skills/tdd/refactoring.md +10 -0
package/skills/tdd/tests.md +61 -0
package/skills/to-issues/SKILL.md +62 -0
package/skills/to-prd/SKILL.md +75 -0
package/skills/using-aw-skills/SKILL.md +170 -237
package/skills/using-aw-skills/hooks/session-start.sh +11 -41
package/skills/zoom-out/SKILL.md +24 -0
package/.codex/hooks/aw-post-tool-use.sh +0 -6
package/.codex/hooks/aw-pre-tool-use.sh +0 -6
package/.codex/hooks/aw-session-start.sh +0 -25
package/.codex/hooks/aw-stop.sh +0 -6
package/.codex/hooks/aw-user-prompt-submit.sh +0 -10
package/.codex/hooks.json +0 -62
package/.cursor/rules/common-agents.md +0 -53
package/.cursor/rules/common-aw-routing.md +0 -43
package/.cursor/rules/common-coding-style.md +0 -52
package/.cursor/rules/common-development-workflow.md +0 -33
package/.cursor/rules/common-git-workflow.md +0 -28
package/.cursor/rules/common-hooks.md +0 -34
package/.cursor/rules/common-patterns.md +0 -35
package/.cursor/rules/common-performance.md +0 -59
package/.cursor/rules/common-security.md +0 -33
package/.cursor/rules/common-testing.md +0 -33
package/.cursor/skills/api-and-interface-design/SKILL.md +0 -75
package/.cursor/skills/article-writing/SKILL.md +0 -85
package/.cursor/skills/aw-brainstorm/SKILL.md +0 -115
package/.cursor/skills/aw-build/SKILL.md +0 -152
package/.cursor/skills/aw-build/evals/build-stage-cases.json +0 -28
package/.cursor/skills/aw-debug/SKILL.md +0 -49
package/.cursor/skills/aw-deploy/SKILL.md +0 -101
package/.cursor/skills/aw-deploy/evals/deploy-stage-cases.json +0 -32
package/.cursor/skills/aw-execute/SKILL.md +0 -47
package/.cursor/skills/aw-execute/references/mode-code.md +0 -47
package/.cursor/skills/aw-execute/references/mode-docs.md +0 -28
package/.cursor/skills/aw-execute/references/mode-infra.md +0 -44
package/.cursor/skills/aw-execute/references/mode-migration.md +0 -58
package/.cursor/skills/aw-execute/references/worker-implementer.md +0 -26
package/.cursor/skills/aw-execute/references/worker-parallel-worker.md +0 -23
package/.cursor/skills/aw-execute/references/worker-quality-reviewer.md +0 -23
package/.cursor/skills/aw-execute/references/worker-spec-reviewer.md +0 -23
package/.cursor/skills/aw-execute/scripts/build-worker-bundle.js +0 -229
package/.cursor/skills/aw-finish/SKILL.md +0 -111
package/.cursor/skills/aw-investigate/SKILL.md +0 -109
package/.cursor/skills/aw-plan/SKILL.md +0 -368
package/.cursor/skills/aw-prepare/SKILL.md +0 -118
package/.cursor/skills/aw-review/SKILL.md +0 -118
package/.cursor/skills/aw-ship/SKILL.md +0 -115
package/.cursor/skills/aw-spec/SKILL.md +0 -104
package/.cursor/skills/aw-tasks/SKILL.md +0 -138
package/.cursor/skills/aw-test/SKILL.md +0 -118
package/.cursor/skills/aw-verify/SKILL.md +0 -51
package/.cursor/skills/aw-yolo/SKILL.md +0 -111
package/.cursor/skills/browser-testing-with-devtools/SKILL.md +0 -81
package/.cursor/skills/bun-runtime/SKILL.md +0 -84
package/.cursor/skills/ci-cd-and-automation/SKILL.md +0 -71
package/.cursor/skills/code-simplification/SKILL.md +0 -74
package/.cursor/skills/content-engine/SKILL.md +0 -88
package/.cursor/skills/context-engineering/SKILL.md +0 -74
package/.cursor/skills/deprecation-and-migration/SKILL.md +0 -75
package/.cursor/skills/documentation-and-adrs/SKILL.md +0 -75
package/.cursor/skills/documentation-lookup/SKILL.md +0 -90
package/.cursor/skills/frontend-slides/SKILL.md +0 -184
package/.cursor/skills/frontend-slides/STYLE_PRESETS.md +0 -330
package/.cursor/skills/frontend-ui-engineering/SKILL.md +0 -68
package/.cursor/skills/git-workflow-and-versioning/SKILL.md +0 -75
package/.cursor/skills/idea-refine/SKILL.md +0 -84
package/.cursor/skills/incremental-implementation/SKILL.md +0 -75
package/.cursor/skills/investor-materials/SKILL.md +0 -96
package/.cursor/skills/investor-outreach/SKILL.md +0 -76
package/.cursor/skills/market-research/SKILL.md +0 -75
package/.cursor/skills/mcp-server-patterns/SKILL.md +0 -67
package/.cursor/skills/nextjs-turbopack/SKILL.md +0 -44
package/.cursor/skills/performance-optimization/SKILL.md +0 -77
package/.cursor/skills/security-and-hardening/SKILL.md +0 -70
package/.cursor/skills/using-aw-skills/SKILL.md +0 -290
package/.cursor/skills/using-aw-skills/evals/skill-trigger-cases.tsv +0 -25
package/.cursor/skills/using-aw-skills/evals/test-skill-triggers.sh +0 -171
package/.cursor/skills/using-aw-skills/hooks/hooks.json +0 -9
package/.cursor/skills/using-aw-skills/hooks/session-start.sh +0 -67
package/.cursor/skills/using-platform-skills/SKILL.md +0 -163
package/.cursor/skills/using-platform-skills/evals/platform-selection-cases.json +0 -52
/package/.cursor/rules/{golang-coding-style.md → golang-coding-style.mdc} +0 -0
/package/.cursor/rules/{golang-hooks.md → golang-hooks.mdc} +0 -0
/package/.cursor/rules/{golang-patterns.md → golang-patterns.mdc} +0 -0
/package/.cursor/rules/{golang-security.md → golang-security.mdc} +0 -0
/package/.cursor/rules/{golang-testing.md → golang-testing.mdc} +0 -0
/package/.cursor/rules/{kotlin-coding-style.md → kotlin-coding-style.mdc} +0 -0
/package/.cursor/rules/{kotlin-hooks.md → kotlin-hooks.mdc} +0 -0
/package/.cursor/rules/{kotlin-patterns.md → kotlin-patterns.mdc} +0 -0
/package/.cursor/rules/{kotlin-security.md → kotlin-security.mdc} +0 -0
/package/.cursor/rules/{kotlin-testing.md → kotlin-testing.mdc} +0 -0
/package/.cursor/rules/{php-coding-style.md → php-coding-style.mdc} +0 -0
/package/.cursor/rules/{php-hooks.md → php-hooks.mdc} +0 -0
/package/.cursor/rules/{php-patterns.md → php-patterns.mdc} +0 -0
/package/.cursor/rules/{php-security.md → php-security.mdc} +0 -0
/package/.cursor/rules/{php-testing.md → php-testing.mdc} +0 -0
/package/.cursor/rules/{python-coding-style.md → python-coding-style.mdc} +0 -0
/package/.cursor/rules/{python-hooks.md → python-hooks.mdc} +0 -0
/package/.cursor/rules/{python-patterns.md → python-patterns.mdc} +0 -0
/package/.cursor/rules/{python-security.md → python-security.mdc} +0 -0
/package/.cursor/rules/{python-testing.md → python-testing.mdc} +0 -0
/package/.cursor/rules/{swift-coding-style.md → swift-coding-style.mdc} +0 -0
/package/.cursor/rules/{swift-hooks.md → swift-hooks.mdc} +0 -0
/package/.cursor/rules/{swift-patterns.md → swift-patterns.mdc} +0 -0
/package/.cursor/rules/{swift-security.md → swift-security.mdc} +0 -0
/package/.cursor/rules/{swift-testing.md → swift-testing.mdc} +0 -0
/package/.cursor/rules/{typescript-coding-style.md → typescript-coding-style.mdc} +0 -0
/package/.cursor/rules/{typescript-hooks.md → typescript-hooks.mdc} +0 -0
/package/.cursor/rules/{typescript-patterns.md → typescript-patterns.mdc} +0 -0
/package/.cursor/rules/{typescript-security.md → typescript-security.mdc} +0 -0
/package/.cursor/rules/{typescript-testing.md → typescript-testing.mdc} +0 -0

package/skills/aw-design/references/quality-checklist.md ADDED

@@ -0,0 +1,70 @@
+# Quality Checklist
+This is the **pass/fail contract** that the self-review loop (step 6) enforces against every screen, every state, and the index page. Items are grouped by review track:
+- **Track A** items are deterministic: checked by regex/grep during `references/self-review.md` Track A.
+- **Track B** items are visual: checked through a browser MCP (`cursor-ide-browser` or `playwright`) during Track B.
+The agent must reach ✅ on every item, or explicitly flag the remaining failures in `designs/REVIEW.md` before presenting.
+## Token compliance
+- [ ] Colors match `references/highrise-tokens.md` exactly — no rogue hex values
+- [ ] Only ONE accent color used (#155EEF — HighRise brand blue)
+- [ ] Status indicators are tiny 6px dots + gray text, not colored pills
+- [ ] Cards are white with thin border, no colored backgrounds
+- [ ] No heavy shadows at rest
+- [ ] Section gaps are 48px+
+- [ ] Metric numbers 30–36px (Display sm/md) with small gray labels
+- [ ] Tables have no zebra stripes
+- [ ] Data looks realistic (real names, plausible numbers, recent dates)
+## Micro-interactions
+- [ ] All interactive elements have `:hover`, `:focus-visible`, and `:active` states
+- [ ] Transitions are 0.15s ease (or 0.2s for larger movement)
+- [ ] `@media (prefers-reduced-motion: reduce)` media query present
+- [ ] Modal has entrance/exit animation
+- [ ] Toast slides in from top-right
+- [ ] Skeleton shimmer uses `@keyframes pulse`
+- [ ] Input focus ring renders correctly in both light and dark mode
+## Variants completeness
+- [ ] Every screen has default + loading + empty + error (+ modal if applicable)
+- [ ] Dark mode toggle works on every screen
+## Responsive (verify by actually resizing the viewport — don't just trust the code)
+Open each HTML file and resize through 320px, 768px, 1024px, and 1440px. Confirm at each width:
+- [ ] No horizontal scroll on `<body>` at any viewport width
+- [ ] **Mobile (≤767px):** sidebar hidden, hamburger drawer works, cards stack vertically, modals go full-screen, touch targets ≥44×44px, body font ≥14px
+- [ ] **Tablet (768–1023px):** sidebar collapsed to 64px icon rail, 2-column grid
+- [ ] **Desktop (1024–1279px):** full 240px sidebar with labels, grid uses 3+ columns
+- [ ] **Wide (≥1280px):** content capped at 1200px max-width, centered
+- [ ] Tables either scroll horizontally OR collapse to card layout on mobile — never overflow the viewport
+- [ ] Media queries use mobile-first `min-width` and are pure CSS (no JS layout switching)
+## Linked prototype
+- [ ] `index.html` exists and links to every screen + state variant
+- [ ] Sidebar nav items have working `<a href>` relative links
+- [ ] Current page highlighted in sidebar
+- [ ] Theme toggle persists via localStorage and applies to all previews
+- [ ] Navigation flow diagram present in index
+## Index page requirements
+The `.aw_docs/features/<slug>/designs/index.html` is mandatory and must include:
+- Feature name and 1-line description
+- Every screen as a card with:
+  - Screen name
+  - Thumbnail (Stitch screenshot URL, or iframe preview for HTML)
+  - Links to each state variant: default, empty, loading, error, modal-*
+- Navigation flow diagram (simple arrows showing screen → screen relationships)
+- Links to `design.md` (at feature root), `SCREEN_PLAN.md` (same folder), and `REVIEW.md` (same folder)
+- Theme toggle (light/dark) that persists to localStorage and applies to all previews
+The index follows the same Highrise tokens and micro-interactions — it must look as polished as the screens it links to.

package/skills/aw-design/references/self-review.md ADDED

@@ -0,0 +1,497 @@
+# Self-Review & Iterate
+Step 6 of the workflow. You don't "present" a design — you **prove it's production-ready** first. This file defines how.
+Run two tracks in order: **deterministic** (fast, regex) then **visual** (browser MCP). Fix findings, re-validate. Loop up to **3 iterations**. Stop early when all pass. If you can't reach a pass state in 3 iterations, write unresolved items to `REVIEW.md` and surface them to the user — don't hide them.
+---
+## Track A — Deterministic sweep
+Run these as shell checks from the feature root (`.aw_docs/features/<slug>/`). Every check produces zero or more findings. Zero findings across all checks = Track A passes.
+### A1. No rogue hex values
+Every hex color in every HTML must exist in `references/highrise-tokens.md`. Anything else is a rogue value.
+```bash
+# Harvest the allowed hex set once
+grep -hoE '#[0-9A-Fa-f]{6}' aw-ecc/skills/aw-design/references/highrise-tokens.md | sort -u > /tmp/allowed_hex.txt
+# Find all hex in generated designs, compare
+grep -rhoE '#[0-9A-Fa-f]{6}' designs/ | sort -u > /tmp/used_hex.txt
+comm -23 /tmp/used_hex.txt /tmp/allowed_hex.txt
+```
+Any output line is a finding: "rogue hex `<value>`". Locate each with `rg -n '<value>' designs/` and fix.
+**Allowed exceptions** (do not flag):
+- Black (`#000000`) used as `rgba(0,0,0,…)` for shadows and scrim, in the same style as the `0 4px 6px -2px rgba(16, 24, 40, 0.03)` shadow already present
+- Values inside HTML comments `<!-- … -->`
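The A1 grep matches hex values wherever they appear, including inside HTML comments, so the comment exception has to be applied by hand. A minimal sketch of one way to automate it, assuming a hypothetical helper named `rogue_hex` that takes the allowed-hex file and the designs directory as arguments (neither the name nor the case-insensitive comparison comes from the skill itself):

```bash
# Hypothetical helper: print hex colors used in a designs directory that are
# missing from the allowed list, ignoring anything inside <!-- ... --> comments.
# $1 = file with one allowed hex per line, $2 = designs directory.
rogue_hex() {
  local allowed_file="$1" designs_dir="$2"
  find "$designs_dir" -name '*.html' -exec awk '
    FNR == 1 { in_comment = 0 }          # reset comment state for each file
    {
      line = $0; out = ""
      while (length(line) > 0) {
        if (in_comment) {
          i = index(line, "-->")
          if (i == 0) { line = "" } else { line = substr(line, i + 3); in_comment = 0 }
        } else {
          i = index(line, "<!--")
          if (i == 0) { out = out line; line = "" }
          else { out = out substr(line, 1, i - 1); line = substr(line, i + 4); in_comment = 1 }
        }
      }
      print out                          # the line with comment text removed
    }' {} + \
    | grep -oE '#[0-9A-Fa-f]{6}' \
    | awk -v allowed="$allowed_file" '
        BEGIN { while ((getline h < allowed) > 0) ok[toupper(h)] = 1 }
        { h = toupper($0); if (!(h in ok) && !(h in seen)) { seen[h] = 1; print h } }' \
    | sort
}
```

Called as `rogue_hex /tmp/allowed_hex.txt designs/`, it prints one rogue value per line, uppercased, and prints nothing when the designs are clean.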
+### A2. Exactly one brand accent
+Only the blue family is a brand color. Any use of violet/purple/indigo/pink/etc. as a primary action color is a finding.
+```bash
+# If any of these 600-values appear, it's an accent misuse
+grep -rhnE '#(6938EF|7839EE|9E77ED|BA24D5|D444F1|E31B54|EF6820|F63D68|DD2590|2E90FA|0BA5EC|0086C9|06AED4|0E9384|15B79E|3CCB7F|66C61C|16B364|669F2A|EAAA08)\b' designs/
+```
+Any match = finding. The only permitted non-blue accents are status colors (success `#12B76A`, warning `#F79009`, error `#F04438`) and only as status indicators — never as button fills or nav highlights.
+### A3. All state variants present
+For every screen folder, the set `{default, empty, loading, error}` must all exist. Modal is optional.
+```bash
+for dir in designs/*/; do
+  [ "$dir" = "designs/screenshots/" ] && continue
+  for state in default empty loading error; do
+    [ -f "$dir$state.html" ] || echo "MISSING: $dir$state.html"
+  done
+done
+```
+### A4. Responsive media queries
+Every screen HTML must contain all three breakpoints.
+```bash
+for f in designs/**/*.html; do
+  grep -q '@media (min-width: 768px)'  "$f" || echo "$f — missing tablet breakpoint"
+  grep -q '@media (min-width: 1024px)' "$f" || echo "$f — missing desktop breakpoint"
+  grep -q '@media (min-width: 1280px)' "$f" || echo "$f — missing wide breakpoint"
+done
+```
+### A5. Focus rings + reduced motion
+Every screen HTML must include `:focus-visible`, at least one `@keyframes`, and the `prefers-reduced-motion` fallback.
+```bash
+for f in designs/**/*.html; do
+  grep -q ':focus-visible'            "$f" || echo "$f — no :focus-visible"
+  grep -q '@keyframes'                "$f" || echo "$f — no @keyframes"
+  grep -q 'prefers-reduced-motion'    "$f" || echo "$f — no prefers-reduced-motion"
+done
+```
+### A6. Typography is on the scale
+Every font size — whether written as `font-size:` or inside `font:` shorthand — must be one of: `8 9 10 11 12 13 14 15 16 18 20 24 30 36 48 60 72` px (or `rem`/`em` equivalents, or a `var(--font-size-*)` reference).
+**A6a — `font-size:` declarations:**
+```bash
+grep -rhoE 'font-size:\s*[0-9.]+(px|rem|em)' designs/ \
+  | grep -vE 'font-size:\s*(8|9|10|11|12|13|14|15|16|18|20|24|30|36|48|60|72)(px|rem|em)' \
+  | sort -u
+```
+**A6b — `font:` shorthand (e.g. `font: 600 30px/38px Inter`):**
+```bash
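+# Pull size tokens out of any "font:" shorthand, drop the "/line-height" part
+# (otherwise 38px in "30px/38px" is reported as a font size), then filter to off-scale
+grep -rhoE 'font:\s*[^;}"]+' designs/ \
+  | sed -E 's#/[[:space:]]*[0-9.]+(px|rem|em|%)?##g' \
+  | grep -oE '[0-9.]+(px|rem|em)\b' \
+  | grep -vE '^(8|9|10|11|12|13|14|15|16|18|20|24|30|36|48|60|72)(px|rem|em)$' \
+  | sort -u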
+```
+Any output from **either** check = off-scale size finding. Run both — `font:` shorthand is the sneaky one Stitch loves to emit.
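The A6 rule allows `rem`/`em` equivalents, but both greps compare the raw number, so `1.5rem` is reported even though it equals 24px. A small sketch of a follow-up check for flagged values, assuming a 16px root font size and treating `em` like `rem` (both are assumptions, and `on_scale` is a hypothetical helper name):

```bash
# Hypothetical helper: succeed (exit 0) when a CSS size lands on the type scale.
# Assumes 1rem = 1em = 16px, which is only true for an unmodified root size.
on_scale() {
  local value="$1"
  awk -v v="$value" 'BEGIN {
    n = split("8 9 10 11 12 13 14 15 16 18 20 24 30 36 48 60 72", scale, " ")
    num = v + 0                                   # numeric prefix, e.g. 1.5 from "1.5rem"
    px = (v ~ /(rem|em)$/) ? num * 16 : num       # convert relative units to px
    for (i = 1; i <= n; i++) if (px == scale[i] + 0) exit 0
    exit 1
  }'
}
```

For example, `on_scale 1.5rem` succeeds and `on_scale 38px` fails, so a value flagged by A6a or A6b can be re-checked before it is recorded as a finding.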
+### A7. Sidebar restraint
+Sidebars are light, not colored.
+```bash
+# Expect sidebar bg to be #F9FAFB (gray-50) or white — never a brand color
+rg -nU --multiline-dotall 'class="[^"]*sidebar[^"]*"[^>]*>.*?background[^;]*?(#155EEF|#EFF4FF|#D1E0FF)' designs/
+```
+Any match = finding. Brand blue belongs on the active nav item only (bg `#EFF4FF`, text `#155EEF`, 2px left border `#155EEF`).
+### A8. Index page completeness
+`designs/index.html` must exist and must link to every screen + state variant.
+```bash
+[ -f designs/index.html ] || echo "MISSING: designs/index.html"
+# Every state HTML must be referenced somewhere in index.html
+for f in designs/*/*.html; do
+  rel="${f#designs/}"
+  grep -q "$rel" designs/index.html || echo "index.html does not link $rel"
+done
+```
+### A9. Realistic data
+Spot-check for placeholder strings that leak past Stitch.
+```bash
+rg -n --ignore-case 'lorem ipsum|placeholder|foo bar|test data|john doe|jane doe|example\.com|dummy' designs/
+```
+Any match = finding. Replace with plausible domain-appropriate data.
+---
+## Track B — Visual sweep (browser MCP)
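+Run after Track A in every iteration, even when Track A still has open findings (the iteration loop below requires both tracks each round).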
+### B0. Browser MCP selection (agent-portable)
+Track B requires a browser MCP that exposes the standard `browser_*` tool surface. The tool names below (`browser_navigate`, `browser_resize`, `browser_snapshot`, `browser_take_screenshot`, `browser_console_messages`, optional `browser_evaluate`) are identical across both supported servers — only the server you route to changes.
+| Environment | Recommended MCP | Notes |
+|---|---|---|
+| Codex | `playwright` (preconfigured in `~/.codex/config.toml` as `@playwright/mcp@latest`) | Portable default. Runs headless Chromium locally. |
+| Claude Desktop / Claude Code | `playwright` (install `@playwright/mcp`) | Same as Codex. |
+| Cursor | `cursor-ide-browser` **or** `playwright` | `cursor-ide-browser` opens a real Cursor tab (nice for the human to watch); Playwright is faster and headless. Either works. |
+Detect capability before starting Track B:
+```
+1. Check whether any of browser_navigate / browser_snapshot are registered.
+2. If yes → proceed.
+3. If no → mark Track B as SKIPPED in REVIEW.md with reason
+   "no browser MCP available in this environment" and downgrade the final
+   status to ⚠️ Shipped with partial verification. Track A is still enforceable.
+```
+**`file://` fallback (mandatory probe before the main sweep).** Some Playwright MCP configs and hardened Cursor setups refuse `file://` URLs for security. Probe once with the index page:
+```
+browser_navigate → file:///<abs>/designs/index.html
+```
+If the call errors with a security/permission/unsupported-scheme message, start a local HTTP server and use it for the rest of Track B:
+```bash
+# Run in a background shell (block_until_ms: 0). Note the PID.
+python3 -m http.server 8765 --directory <abs path to designs>
+```
+Then substitute `http://127.0.0.1:8765/...` for every `file://` URL in B1–B5. Kill the server in B6 teardown with the PID. Record in REVIEW.md which scheme was used (`file://` or `http://127.0.0.1:8765`).
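The scheme substitution can be kept in one place so B1 to B5 never hard-code `file://`. A minimal sketch, assuming a hypothetical helper `design_url` and the port 8765 from the fallback command above:

```bash
# Hypothetical helper: build the URL for one design file under either scheme.
# $1 = "file" or "http", $2 = absolute path to designs/, $3 = path relative to designs/.
design_url() {
  local scheme="$1" designs_abs="$2" rel="$3"
  case "$scheme" in
    file) printf 'file://%s/%s\n' "$designs_abs" "$rel" ;;
    http) printf 'http://127.0.0.1:8765/%s\n' "$rel" ;;   # served by python3 -m http.server
    *)    echo "unknown scheme: $scheme" >&2; return 1 ;;
  esac
}
```

With this, the scheme recorded in REVIEW.md and the scheme actually used for navigation come from the same variable.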
+**Cursor-only step:** if you selected `cursor-ide-browser`, call `browser_lock { action: "lock" }` after the first navigation and `browser_lock { action: "unlock" }` at the end of Track B. Playwright MCP has no equivalent — skip it.
+### B1. Per-screen breakpoint pass
+Open `designs/index.html` once as a warm-up, then for each screen + state (loop across `designs/*/*.html`):
+```
+browser_navigate → <scheme>://…/<screen>/<state>.html
+for width in [320, 768, 1024, 1440]:
+    browser_resize { width, height: 900 }
+    browser_snapshot
+    browser_take_screenshot → designs/screenshots/<screen>-<state>-<width>.png
+```
+**What to look for in each snapshot + screenshot:**
+| Width | Must be true |
+|---|---|
+| 320 | No sidebar visible, no horizontal scroll, touch targets ≥44×44, body font ≥14px, modals full-screen if present, hero/brand panels must not push primary content below the fold |
+| 768 | Sidebar collapsed to 64px icon rail, 2-column grid where applicable |
+| 1024 | Full 240px sidebar with labels, ≥3-column grid on dashboards |
+| 1440 | Content capped at ~1200px max-width and centered — not edge-to-edge |
+For each violation, write a finding: `<screen>/<state> @ <width>px: <what is wrong>`.
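Because B1 names every screenshot as `<screen>-<state>-<width>.png`, missing captures can be listed by name. A sketch under that naming assumption, with `missing_captures` as a hypothetical helper:

```bash
# Hypothetical helper: list B1 screenshots that should exist but do not.
# $1 = designs directory containing <screen>/<state>.html and screenshots/.
missing_captures() {
  local designs="$1" f screen state width
  for f in "$designs"/*/*.html; do
    [ -f "$f" ] || continue                       # skip when the glob matched nothing
    screen="$(basename "$(dirname "$f")")"
    state="$(basename "$f" .html)"
    for width in 320 768 1024 1440; do
      [ -f "$designs/screenshots/$screen-$state-$width.png" ] \
        || echo "MISSING: $screen-$state-$width.png"
    done
  done
}
```

Running `missing_captures designs` after the loop shows exactly which breakpoint was skipped for which screen and state.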
+**Capture matrix (enforced head count).** Before marking B1 as ✅, compute:
+```bash
+# Count generated HTML files (exclude index.html; screenshots/ holds only PNGs)
+files=$(find designs/ -name '*.html' ! -name 'index.html' | wc -l | tr -d ' ')
+# B2 takes one dark capture per screen folder (its default.html)
+dark=$(find designs/ -name 'default.html' | wc -l | tr -d ' ')
+# Expected captures: B1 (4 widths per file) + B2 (1 dark capture per screen folder)
+expected=$(( files * 4 + dark ))
+# Actual captures on disk
+actual=$(find designs/screenshots/ -name '*.png' 2>/dev/null | wc -l | tr -d ' ')
+echo "Captures: $actual / $expected"
+```
+B1 + B2 combined PASS requires `actual >= expected`. A ratio of at least 90% but below 100% is a partial result; if `actual < 0.9 × expected`, Track B is ❌, not partial, because you skipped work. Record the ratio in REVIEW.md verbatim (e.g. `Captures: 85/85 (100%) PASS` or `Captures: 12/85 (14%) FAIL: most breakpoints never rendered`).
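The thresholds can be applied with integer arithmetic so the REVIEW.md line is generated rather than typed. A sketch assuming the strict reading, where only `actual >= expected` passes and the band from 90% up to 100% is labeled `PARTIAL` (that label and the helper name `classify_captures` are assumptions, not part of the skill):

```bash
# Hypothetical helper: format the capture ratio line and classify it.
# $1 = actual capture count, $2 = expected capture count.
classify_captures() {
  local actual="$1" expected="$2" pct status
  if [ "$expected" -le 0 ]; then
    echo "Captures: $actual/$expected FAIL"       # nothing was expected: treat as an error
    return 1
  fi
  pct=$(( actual * 100 / expected ))              # integer percentage, rounded down
  if [ "$actual" -ge "$expected" ]; then
    status="PASS"
  elif [ $(( actual * 10 )) -ge $(( expected * 9 )) ]; then
    status="PARTIAL"                              # at least 90% but not complete
  else
    status="FAIL"
  fi
  echo "Captures: $actual/$expected (${pct}%) $status"
}
```

The output of `classify_captures "$actual" "$expected"` can be pasted into REVIEW.md as is.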
+### B2. Dark mode pass
+At 1440px width, toggle the dark class and re-screenshot the default state of each screen folder:
+```
+browser_navigate → <scheme>://…/<screen>/default.html
+browser_resize   → 1440 × 900
+# Inject .dark on <html>
+browser_snapshot   (confirm dark class present)
+browser_take_screenshot → designs/screenshots/<screen>-default-1440-dark.png
+```
+Inspect: background near-black (not pure `#000`), text gray-100/200 (not pure white), borders visible at low contrast, accent blue still legible on dark surface.
+### B3. Cross-screen consistency
+Pick two screens in the feature (e.g., list view + detail view). At 1440 light mode:
+- Primary button: same height, padding, bg, hover shade
+- Input: same height, border, focus ring
+- Card: same border, radius, shadow (or lack thereof)
+- Sidebar: same width, item spacing, active-state styling
+Any visible inconsistency = finding.
+### B4. Console check
+After every navigation:
+```
+browser_console_messages
+```
+Any JS error or CSS parse error = finding.
+### B5. Computed-style spot-check (optional, Playwright only)
+If `browser_evaluate` is available (Playwright MCP exposes it; cursor-ide-browser does not), run computed-style assertions that regex can't catch. Pick one primary button and one focused input per screen and verify rendered values:
+```js
+// Inside browser_evaluate
+const btn = document.querySelector('.btn-primary, [data-role="primary"]');
+if (!btn) return { error: 'no primary button found' };  // nothing to measure on this screen
+const s = getComputedStyle(btn);
+return {
+  bg: s.backgroundColor,           // must resolve to rgb(21, 94, 239) == #155EEF
+  radius: s.borderRadius,          // must be 8px
+  fontWeight: s.fontWeight,        // must be 500 or 600
+  height: s.height                 // must be ≥ 36px
+};
+```
+This catches cascade bugs (e.g. a `:root` override that silently broke the token) that Track A's hex grep can't see. Skip this section on cursor-ide-browser — not a finding, just `N/A` in REVIEW.md.
+### B6. Teardown
+```
+(if cursor-ide-browser) browser_lock { action: "unlock" }
+(if http fallback)     kill <pid of python3 http.server>
+```
+Playwright MCP auto-cleans on session end — no explicit unlock needed.
+---
+## Categorizing findings → fix method
+Every finding must be tagged with a fix method before applying. This keeps us off the Stitch quota.
+| Finding class | Fix method | Stitch cost |
+|---|---|---|
+| Rogue hex (A1, A2) | `sed -i` on the HTML — swap to the correct token hex | 0 |
+| Missing `@media` block (A4) | Direct edit — append the block to `<style>` | 0 |
+| Missing `:focus-visible` / `@keyframes` / `prefers-reduced-motion` (A5) | Direct edit — copy from `references/micro-interactions.md` | 0 |
+| Off-scale font-size (A6) | Direct edit — snap to nearest scale value | 0 |
+| Wrong sidebar bg (A7) | Direct edit — change background token | 0 |
+| Index page missing links (A8) | Direct edit — append `<a>` entries | 0 |
+| Placeholder data (A9) | Direct edit — substitute realistic values | 0 |
+| Missing state variant file (A3) | `stitch_generate-screen` (Flash) with state-variant prompt | 1 |
+| Cross-screen inconsistency (B3) | `stitch_apply-design-system` multi-select on affected screens | 1 total |
+| Layout broken at a breakpoint (B1) | `stitch_edit-screens` with specific instruction | 1 per screen |
+| Dark mode unreadable (B2) | Direct edit — adjust `.dark` overrides in CSS | 0 |
+| Architectural wrongness (wrong hierarchy, wrong primary CTA) | **Do not auto-fix.** Surface to user as a BLOCKER in REVIEW.md | 0 |
+**Hard rule:** never regenerate a whole screen to fix a rogue hex. If a finding has a 0-cost fix path, that is the only acceptable fix method.
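A zero-cost rogue-hex fix can be scripted so it behaves the same on GNU and BSD `sed`. A minimal sketch, assuming a hypothetical helper `swap_hex`; which token hex is the right replacement is still a per-finding decision:

```bash
# Hypothetical helper: replace one rogue hex with a token hex in every HTML file
# under a directory. $1 = rogue hex, $2 = replacement hex, $3 = designs directory.
swap_hex() {
  local old="$1" new="$2" dir="$3"
  find "$dir" -name '*.html' -exec grep -lF -- "$old" {} + | while IFS= read -r file; do
    # -i.bak is accepted by both GNU and BSD sed; the backup is removed on success
    sed -i.bak "s/$old/$new/g" "$file" && rm -f "$file.bak"
  done
}
```

For example, `swap_hex '#FF0000' '#F04438' designs` rewrites a stray red to the error status color, after which A1 can be re-run to confirm the finding is gone.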
+---
+## The iteration loop — what "done" actually means
+This is the skill's teeth. The loop is **not optional** and **not a single pass**. Treat each of these as a hard contract:
+1. **Every iteration runs both tracks, in full.** No skipping Track B because Track A still has findings — Track B catches things Track A can't see, and you need both signals every round.
+2. **Every iteration must apply fixes to the findings it produced.** An iteration where you ran the checks but didn't edit anything is not a real iteration.
+3. **Every iteration must produce evidence in REVIEW.md.** See "Per-iteration evidence" below.
+4. **The loop stops for exactly three reasons** — and only the first is a success:
+   | Stop condition | REVIEW.md status |
+   |---|---|
+   | Zero findings across both tracks | ✅ Production-ready |
+   | 3 iterations completed, some findings remain but count is decreasing | ⚠️ Shipped with known issues (only valid with **≥2 iterations of fixes on disk**) |
+   | Findings count stopped decreasing (fix regressed something) or a BLOCKER finding surfaced | ❌ Blocked — surface to user |
+   A status of ⚠️ is **invalid** if fewer than 2 iterations applied fixes. If you ran 1 iteration and have findings, the status is ❌ BLOCKED, not ⚠️. No shortcuts.
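As a cross-check on the stop-condition table, the status can be derived from the history of per-iteration finding counts alone. This is a minimal sketch under stated assumptions: `final_status`, its parameters, and the three constants are hypothetical names, and an empty history is treated as blocked because no iteration produced evidence.

```python
OK = "✅"
KNOWN_ISSUES = "⚠️"
BLOCKED = "❌"


def final_status(counts: list[int], fix_iterations: int, blocker: bool = False) -> str:
    """Map a run history onto the three stop conditions.

    counts[i] is the number of findings recorded in iteration i + 1.
    fix_iterations is how many iterations actually applied fixes on disk.
    blocker is True if any BLOCKER finding surfaced.
    """
    if blocker or not counts:
        return BLOCKED
    prev = None
    for n in counts:
        if n == 0:
            return OK                      # zero findings across both tracks
        if prev is not None and n >= prev:
            return BLOCKED                 # count stopped decreasing
        prev = n
    # Findings remain: the warning status needs all 3 iterations and >= 2 with fixes.
    if len(counts) >= 3 and fix_iterations >= 2:
        return KNOWN_ISSUES
    return BLOCKED
```

A single iteration that still has findings, `final_status([5], 1)`, comes out blocked, which matches the rule that the warning status is invalid with fewer than 2 fix iterations.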
+### Pseudocode
+```
+iter = 1
+prev_count = infinity
+final_status = unset
+iterations_with_fixes_on_disk = 0
+while iter <= 3:
+    findings_A = run_track_A()
+    findings_B = run_track_B()      # always runs, no skipping
+    findings   = findings_A + findings_B
+    write_iteration_evidence(iter, findings)   # to REVIEW.md § Iteration <iter>
+    if len(findings) == 0:
+        final_status = "✅"
+        break
+    if any_blocker(findings):
+        final_status = "❌"   # BLOCKER surfaced: stop and hand to the user
+        break
+    if iter > 1 and len(findings) >= prev_count:
+        final_status = "❌"   # not converging
+        break
+    prev_count = len(findings)
+    apply_fixes(findings)              # <-- mandatory; iteration without this is void
+    record_fixes_applied(iter)         # to REVIEW.md § Iteration <iter> § Fixes
+    iterations_with_fixes_on_disk += 1
+    iter += 1
+if final_status is unset:
+    # Ran all 3, still have findings, but count decreased each time
+    final_status = "⚠️"
+    assert iterations_with_fixes_on_disk >= 2, "⚠️ requires ≥2 fix iterations"
+finalize_REVIEW_md(final_status, remaining=findings)
+```
+**Convergence check:** iteration N+1 must have strictly fewer findings than iteration N. If not, a fix regressed something — stop and write ❌ BLOCKED with the regression listed.
+**No ask-first shortcuts.** Do not stop after iteration 1 to ask the user "should I continue?" The contract says 3 iterations (or zero findings). Asking is a protocol violation.
+---
+## REVIEW.md — the evidence-required output contract
+`designs/REVIEW.md` is not a summary document. It is an **audit trail** that proves each iteration actually ran. An agent writing its own REVIEW.md should not be able to fake compliance: the file format demands pasted commands, numeric outputs, capture counts, and fix diffs. Anything less is non-compliant.
+### Per-check evidence requirement (Track A)
+Every A-check row in every iteration's section must include:
+1. The **exact command** that was run (copy-pasted, not paraphrased).
+2. The **raw output** or a match-count (e.g. `→ 3 matches` or `→ empty output, 0 findings`).
+3. A ✅/❌ verdict.
+A row without command + output is treated as **not run** and forces the status to ❌ BLOCKED, regardless of what the verdict column says.
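The per-row requirement above lends itself to a mechanical check. The following is a minimal sketch, assuming the five-column Track A layout (`#`, `Check`, `Command run`, `Output`, `Verdict`) and that pipes inside a cell are escaped as `\|`; the helper name `track_a_row_has_evidence` is hypothetical.

```python
import re

# Split on pipes that are not escaped as \| inside a cell.
_CELL_SPLIT = re.compile(r"(?<!\\)\|")


def track_a_row_has_evidence(row: str) -> bool:
    """Check one Track A table row for command, output, and verdict.

    Assumes the five-column layout: # | Check | Command run | Output | Verdict.
    """
    cells = [c.strip() for c in _CELL_SPLIT.split(row.strip())]
    # A row like "| a | b |" splits into ["", "a", "b", ""]; drop the edges.
    cells = cells[1:-1]
    if len(cells) != 5:
        return False
    _check_id, _check_name, command, output, verdict = cells
    has_verdict = verdict.startswith(("✅", "❌"))
    return bool(command) and bool(output) and has_verdict
```

A row that fails this check would be treated as not run, whatever its verdict cell says.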
+### Per-iteration capture count (Track B)
+B1 must record the capture matrix ratio before its ✅. If `actual < 0.9 × expected` the row is ❌ and forces a ❌ BLOCKED overall status.
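The 90% threshold can be sketched as a small helper. The name `b1_capture_verdict` is hypothetical, and the output format follows the `Captures: X / Y` fragment used in the template. Passing this gate is only the capture-count precondition; visual findings can still fail the row.

```python
def b1_capture_verdict(actual: int, expected: int) -> str:
    """Return the B1 capture fragment plus a verdict for the capture count."""
    if expected <= 0:
        raise ValueError("expected capture count must be positive")
    percent = round(100 * actual / expected)
    # The rule fails the row when actual < 0.9 * expected.
    # The integer form (10 * actual >= 9 * expected) avoids float rounding
    # right at the boundary.
    passed = 10 * actual >= 9 * expected
    verdict = "✅" if passed else "❌"
    return f"Captures: {actual} / {expected} ({percent}%) {verdict}"
```

With 20 expected captures, 18 is the lowest count that still passes, since 18 is exactly 0.9 × 20.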
+### Template
+```markdown
+# Design Review — <feature>
+**Status:** ✅ Production-ready  |  ⚠️ Shipped with known issues (requires ≥2 fix iterations)  |  ❌ Blocked
+**Iterations run:** N / 3
+**Iterations with fixes applied:** M   (M ≥ 2 required for ⚠️)
+**Browser MCP:** playwright | cursor-ide-browser | none (Track B skipped)
+**URL scheme used:** file:// | http://127.0.0.1:8765 (local server fallback)
+**Last reviewed:** <YYYY-MM-DD HH:MM TZ>
+## Summary
+<1–3 sentence plain-English state of the designs. If ⚠️ or ❌, lead with what is broken.>
+---
+## Iteration 1
+### Track A — deterministic
+| # | Check | Command run | Output | Verdict |
+|---|---|---|---|---|
+| A1 | rogue hex | `comm -23 /tmp/used_hex.txt /tmp/allowed_hex.txt` | `(empty)` | ✅ |
+| A2 | one brand accent | `grep -rhnE '#(6938EF\|7839EE\|...)' designs/` | `0 matches` | ✅ |
+| A3 | state variants | <loop script output> | `no MISSING lines` | ✅ |
+| A4 | responsive breakpoints | <loop output> | `0 warnings` | ✅ |
+| A5 | focus + motion | <loop output> | `0 warnings` | ✅ |
+| A6a | font-size: scale | `grep -rhoE 'font-size:...' \| grep -vE ...` | `(empty)` | ✅ |
+| A6b | font: shorthand | `grep -rhoE 'font:...' \| grep -oE ... \| grep -vE ...` | `14px` | ❌ 1 off-scale (in error.html) |
+| A7 | sidebar restraint | `rg -n --multiline-dotall 'class="[^"]*sidebar[^"]*"...' designs/` | `0 matches` | ✅ |
+| A8 | index completeness | `grep -q` loop over `designs/*/*.html` | `no unlinked lines` | ✅ |
+| A9 | realistic data | `rg -n --ignore-case 'lorem ipsum\|...' designs/` | `0 matches` | ✅ |
+### Track B — visual
+| # | Check | Evidence | Verdict |
+|---|---|---|---|
+| B1 | breakpoint + capture matrix | Captures: **20 / 20 (100%)**. files=4, widths=4, dark=4. Layout breaks at 320px in `login/loading.html` | ❌ 1 layout finding |
+| B2 | dark mode | 4 dark screenshots at 1440 captured | ✅ |
+| B3 | cross-screen consistency | Button heights 40/40/40 px; input borders identical; card radius 8/8/8 | ✅ |
+| B4 | console clean | No errors across 16 navigations | ✅ |
+| B5 | computed-style spot-check | `.btn-primary` bg `rgb(21, 94, 239)`, radius `8px`, weight `500`, height `40px` | ✅ |
+### Findings in this iteration
+1. **A6b / error.html** — `font: 600 14px/20px Inter` (14px not in scale; closest is 15px)
+2. **B1 / login/loading.html @ 320px** — brand hero panel occupies full viewport height, pushes form skeleton below fold
+### Fixes applied
+1. A6b → changed `font: 600 14px/20px Inter` to `font: 600 15px/22px Inter` in `error.html` line 147
+2. B1 → added `@media (max-width: 767px) { .brand-panel { display: none; } }` in `loading.html` line 62
+---
+## Iteration 2
+<same format — must show fewer findings than Iteration 1>
+---
+## Iteration 3
+<only present if Iteration 2 still had findings>
+---
+## Final status
+- **Status:** ✅ / ⚠️ / ❌
+- **Remaining findings:** <0 or list>
+## Known issues (only if status is ⚠️ or ❌)
+For each remaining finding after iteration 3:
+- **Severity:** blocker / minor
+- **Where:** `<screen>/<state>.html` (+ breakpoint if visual)
+- **What:** <description>
+- **Why it wasn't auto-fixed:** <reason — needs judgment, needs Stitch regen the user didn't authorize, architectural>
+- **Suggested next step:** <concrete action for the user>
+## Artifacts
+- Screenshots: `designs/screenshots/<screen>-<state>-<width>[-dark].png` (N files)
+- Source files reviewed: <count>
+```
+### Anti-fake rules (self-review enforcement)
+Before finalizing REVIEW.md, run these sanity checks on the file you just wrote:
+| Rule | How to verify |
+|---|---|
+| Every A-check row has a non-empty `Command run` column | `grep -cE '^\| A[0-9]' REVIEW.md` equals 10 per iteration (A1–A5, A6a, A6b, A7–A9), and none of those rows has an empty `Command run` cell |
+| B1 row includes a `Captures: X / Y` fragment | `grep -cE 'Captures: \**[0-9]' REVIEW.md` ≥ iteration count (the `\**` tolerates the bold markers used in the template) |
+| If status is ⚠️, at least 2 iterations contain a `### Fixes applied` subsection with non-empty body | count of `### Fixes applied` sections with content ≥ 2 |
+| If status is ✅, the final iteration's findings list is literally empty | last `### Findings in this iteration` has no numbered items |
+If any rule fails, **the status is ❌ BLOCKED** — rewrite the missing evidence or downgrade honestly. Do not present ⚠️ without the fixes-on-disk proof.
+If status is ⚠️ or ❌, **explicitly flag this when presenting to the user** — don't bury it. The whole point of this review is that the agent is honest about what it couldn't verify or fix.
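The four sanity checks above could also be run as one script over the finished file. This is a rough sketch, not the skill's own tooling: `anti_fake_failures`, `_section_bodies`, and the failure labels are hypothetical, and it assumes the heading names and row prefixes shown in the template.

```python
import re

OK = "✅"
KNOWN_ISSUES = "⚠️"


def _section_bodies(lines: list[str], heading: str) -> list[list[str]]:
    """Collect the non-blank body lines under every occurrence of `heading`."""
    bodies, current = [], None
    for ln in lines:
        if ln.startswith("#") or ln.strip() == "---":
            if current is not None:
                bodies.append(current)
            current = [] if ln.strip() == heading else None
        elif current is not None and ln.strip():
            current.append(ln.strip())
    if current is not None:
        bodies.append(current)
    return bodies


def anti_fake_failures(review_md: str, iterations: int, status: str) -> list[str]:
    """Return labels for the anti-fake rules that a REVIEW.md body fails."""
    lines = review_md.splitlines()
    failures = []

    # Rule 1: 10 Track A rows (A1-A5, A6a, A6b, A7-A9) per iteration.
    a_rows = sum(1 for ln in lines if re.match(r"\| A[0-9]", ln))
    if a_rows != 10 * iterations:
        failures.append("track-a-row-count")

    # Rule 2: one "Captures: X / Y" fragment per iteration (bold markers allowed).
    captures = sum(1 for ln in lines if re.search(r"Captures: \**[0-9]", ln))
    if captures < iterations:
        failures.append("b1-capture-count")

    # Rule 3: the warning status needs >= 2 non-empty "Fixes applied" sections.
    if status == KNOWN_ISSUES:
        fixed = [b for b in _section_bodies(lines, "### Fixes applied") if b]
        if len(fixed) < 2:
            failures.append("fix-iterations")

    # Rule 4: a success status needs an empty final findings list.
    if status == OK:
        findings = _section_bodies(lines, "### Findings in this iteration")
        if not findings or any(re.match(r"\d+\.", item) for item in findings[-1]):
            failures.append("final-findings-not-empty")

    return failures
```

Any non-empty result would mean the status has to be downgraded to ❌ BLOCKED or the missing evidence rewritten.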
+---
+## Related skills (reference, not duplicate)
+These exist in the broader registry and cover adjacent but distinct concerns. Don't re-read them whole — point to specific sections when the need arises.
+- **`platform-design:pixel-fidelity-review`** — computed-style audit comparing a *Vue implementation* against an HTML design prototype. Opposite direction from us: they treat the HTML as ground truth; we audit the HTML itself. Borrow their `browser_evaluate` computed-style patterns for B5 if you need depth beyond the spot-check.
+- **`platform-design:auditor`** (subagent) — end-to-end design fidelity auditor with pass/fail verdicts. If you need a full parallel review run (not just self-check), delegate via Task tool.
+- **`platform-webapp-testing`** — Playwright helper scripts (`with_server.py`) for live dev servers. Not needed here (we use `file://` URLs) but relevant if a future step spins up a dev server to audit a built implementation.