npm - start-vibing - Versions diffs - 4.0.2 → 4.1.1 - Mend

start-vibing 4.0.2 → 4.1.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (67) hide show

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
 	"name": "start-vibing",
-	"version": "4.0.2",
+	"version": "4.1.1",
 	"description": "Setup Claude Code with 9 plugins, 6 community skills, and 8 MCP servers. Parallel install, auto-accept, superpowers + ralph-loop.",
 	"type": "module",
 	"bin": {

package/template/.claude/CLAUDE.md CHANGED Viewed

@@ -47,7 +47,7 @@ All phases run best-effort. MCP and plugin failures are non-blocking — `settin
 ```
 .claude/
-├── agents/          # 4 active subagents (flat structure)
+├── agents/          # 5 active subagents (flat structure)
 ├── skills/          # Custom + 5 community skills (auto-injected by description match)
 ├── scripts/         # Utility scripts
 ├── config/          # Project-specific configuration (JSON files)
@@ -56,11 +56,12 @@ All phases run best-effort. MCP and plugin failures are non-blocking — `settin
 ---
-## Agents (4 Active Subagents)
+## Agents (5 Active Subagents)
 | Agent | Model | Purpose |
 |-------|-------|---------|
-| **research-web** | sonnet | Researches best practices (2024-2026) before implementing new features |
+| **research-web** | sonnet | Researches best practices (2025-2026) with ontology-based structuring, output to `/docs/research/` |
+| **documenter** | sonnet | Analyzes sessions via git log/diff, writes changelog + technical docs + ADRs to `/docs/` |
 | **commit-manager** | haiku | Manages git commits, conventional format, merge workflow |
 | **claude-md-compactor** | sonnet | Compacts CLAUDE.md when it exceeds 40k chars |
 | **tester-unit** | sonnet | Creates unit tests with Vitest for new functions and utilities |
@@ -68,11 +69,65 @@ All phases run best-effort. MCP and plugin failures are non-blocking — `settin
 ### Agent Workflow Order
 ```
-implement -> quality gates -> ask user about docs -> commit-manager -> complete
+implement -> quality gates -> documenter -> commit-manager -> complete
 ```
 Agents are dispatched via the `Agent` tool with `subagent_type` matching agent names. They run autonomously and return results to the orchestrator.
+### Research Agent Details
+The research-web agent outputs findings to `/docs/research/[topic].md` (NOT to `.claude/skills/research-cache/`).
+**Research flow:**
+1. Check `/docs/research/` for existing findings (reuse if fresh < 3 months)
+2. Build ontology map (concepts → relationships → constraints)
+3. Search with `[topic] [aspect] [2025-2026] [context]` queries
+4. Triangulate (3+ sources for any claim)
+5. Save structured output to `/docs/research/`
+**For UI/UX tasks, the agent also runs:**
+- Competitor analysis (3-5 competitors, heuristic evaluation)
+- Design system pattern check (shadcn/ui, Radix, WCAG 2.1)
+- User flow mapping (happy path + 2 error paths)
+- Accessibility audit plan (axe, keyboard nav, screen reader)
+### Documenter Agent Details
+The documenter agent runs after implementation, analyzes the session, and writes structured docs to `/docs/`.
+**Documentation flow:**
+1. Run `git log` + `git diff` to analyze what changed
+2. Classify changes: per-commit, per-feature, per-session
+3. Mini-research for technologies used (check `/docs/research/` first, else 1-2 queries)
+4. Write changelog (`/docs/changelog/`), technical docs (`/docs/technical/`), ADRs (`/docs/decisions/`)
+5. Update all indexes (`/docs/index.md` + per-folder indexes)
+**Output structure:**
+```
+/docs/
+├── index.md              # Root index
+├── changelog/            # Per-session changelogs
+│   ├── index.md
+│   └── YYYY-MM-DD-summary.md
+├── technical/            # Deep feature/architecture docs
+│   ├── index.md
+│   └── feature-name.md
+├── decisions/            # Architecture Decision Records
+│   ├── index.md
+│   └── NNNN-decision.md
+└── research/             # Managed by research-web
+    ├── index.md
+    └── topic.md
+```
+**Writing rules:**
+- Self-contained sections (AI RAG chunk retrieval)
+- What→Why→How progression (humans first)
+- Before→After pattern for all changes (mandatory)
+- Consistent terminology (one name per concept)
+- Official docs URLs for all technologies cited
+- Every doc linked from its folder index AND root index
 ---
 ## Plugins (9 via enabledPlugins)
@@ -279,7 +334,7 @@ The `.claude/settings.json` file contains `enabledPlugins` which is the fallback
 1. Use `/brainstorming` for creative work or feature design
 2. Use `/write-plan` for multi-step tasks
 3. Use `context7` (auto via plugin) for library docs
-4. Research if needed (research-web agent or web search)
+4. Research if needed (research-web agent → saves to `/docs/research/`)
 ### During Implementation
@@ -291,31 +346,39 @@ The `.claude/settings.json` file contains `enabledPlugins` which is the fallback
 1. Run `/simplify` (code-simplifier) on changed files
 2. Run quality gates (`bun run typecheck && lint && test`)
-3. Ask user if they want documentation in `/docs`
-4. Commit via commit-manager agent (FINAL step)
-5. Update CLAUDE.md with architecture changes
+3. Run documenter agent (changelog + technical docs + ADRs to `/docs/`)
+4. Update CLAUDE.md with architecture changes
+5. Commit via commit-manager agent (FINAL step)
 ---
 ## Documentation Policy
-> Documentation is NOT automatic. It lives in `/docs` and is created only when the user asks.
+> Documentation is automatic via the **documenter agent**. It lives in `/docs/`.
 ### After Completing Work
-Always ask the user:
-```
-Done! Finished [task]. Want me to:
-1. Document this in /docs?
-2. Move on to something else?
-```
+The documenter agent runs automatically after implementation:
+1. Analyzes git log/diff for the session
+2. Writes changelog + technical docs + ADRs as needed
+3. Updates all indexes
+4. Docs are ready for commit
+### Research Output
+Research findings go to `/docs/research/[topic].md` — NOT to `.claude/skills/research-cache/`.
+- Ontology-based structure: concepts → relationships → constraints → implementation path
+- Freshness tracked per file (< 3 months fresh, 3-6 aging, 6-12 stale, > 12 outdated)
+- Always check existing research before running new searches
 ### What NOT to Do
-- Do NOT auto-generate domain docs
-- Do NOT maintain `.claude/skills/codebase-knowledge/domains/`
-- Do NOT run documenter or domain-updater agents (they were removed in v4)
-- Do NOT create documentation without user consent
+- Do NOT maintain `.claude/skills/codebase-knowledge/domains/` (legacy)
+- Do NOT save research to `.claude/skills/research-cache/` — use `/docs/research/` instead
+- Do NOT mix doc types in one file (changelog ≠ technical ≠ decision)
+- Do NOT leave docs unlinked from indexes
+- Do NOT skip Before→After pattern in changelogs
 ---
@@ -326,6 +389,7 @@ All implementations MUST:
 - [ ] Pass typecheck (`bun run typecheck`)
 - [ ] Pass lint (`bun run lint`)
 - [ ] Pass unit tests (`bun run test`)
+- [ ] Be documented by documenter agent (changelog + technical/ADR if applicable)
 - [ ] Be committed with conventional commits
 - [ ] Have CLAUDE.md updated with architecture/rule changes
@@ -340,7 +404,9 @@ All implementations MUST:
 | Use `any` type                 | Defeats strict mode          |
 | Define types in `src/`         | Must be in `types/`          |
 | Commit directly to main        | Create feature/fix branches  |
-| Auto-document without asking   | Always ask user first        |
+| Skip documenter after implementation | Changelog + docs are mandatory |
+| Mix doc types in one file      | Changelog ≠ technical ≠ decision |
+| Leave docs unlinked from index | Undiscoverable docs are useless |
 | Skip superpowers for features  | Use brainstorming + TDD      |
 | Skip code-simplifier           | Run /simplify post-implementation |
 | Use MUI/Chakra                 | Use shadcn/ui + Radix        |

package/template/.claude/agents/sd-audit.md ADDED Viewed

@@ -0,0 +1,197 @@
+---
+name: sd-audit
+description: Performs the complete UX audit by driving the browser via Playwright MCP directly. Applies Nielsen's 10 heuristics, WCAG 2.2 AA, Baymard (if e-commerce), Core Web Vitals, and 60+ expert-tier implicit criteria. Invoked by super-design after research completes. Produces findings.json with strict SHOT+QUOTE+SEL+VAL evidence per finding.
+tools:
+  - Read
+  - Write
+  - Edit
+  - Glob
+  - Grep
+  - Bash
+  - mcp__playwright__browser_navigate
+  - mcp__playwright__browser_navigate_back
+  - mcp__playwright__browser_resize
+  - mcp__playwright__browser_snapshot
+  - mcp__playwright__browser_take_screenshot
+  - mcp__playwright__browser_evaluate
+  - mcp__playwright__browser_click
+  - mcp__playwright__browser_type
+  - mcp__playwright__browser_hover
+  - mcp__playwright__browser_press_key
+  - mcp__playwright__browser_select_option
+  - mcp__playwright__browser_wait_for
+  - mcp__playwright__browser_console_messages
+  - mcp__playwright__browser_network_requests
+  - mcp__playwright__browser_handle_dialog
+  - mcp__playwright__browser_tabs
+  - mcp__playwright__browser_install
+  - mcp__playwright__browser_close
+model: sonnet
+permissionMode: acceptEdits
+maxTurns: 150
+color: cyan
+mcpServers:
+  - playwright
+---
+# Role
+You are the UX/a11y/perf auditor. You drive the browser DIRECTLY via Playwright MCP (no delegation). You apply Nielsen's 10 heuristics, WCAG 2.2 AA, Baymard e-commerce findings (when applicable), Core Web Vitals, and 60+ expert "implicit" criteria. Every finding cites SHOT + QUOTE + SEL + VAL.
+# Preflight
+Read in order:
+1. `.claude/skills/super-design/references/audit-methodology.md` — methodology spine
+2. `.claude/skills/super-design/references/playwright-mcp-reference.md` — Playwright MCP API
+3. `docs/super-design/market-analysis.md` — context (archetype, audience, category)
+# Non-negotiable rules
+1. Say "Playwright MCP" literally when invoking tools. Use only `mcp__playwright__*`.
+2. Every finding cites [SHOT], [QUOTE], [SEL], [VAL]. Missing any → file the gap, not the finding.
+3. Snapshots are per-call. Every `[ref=eNN]` valid for ONE action. Re-snapshot after any mutation.
+4. Save artifacts to disk BEFORE writing any finding.
+5. On JS console errors, stop auditing that page. Record errors verbatim.
+6. Dismiss cookie banners FIRST on every page before canonical snapshot.
+7. Text waits, never time waits. `browser_wait_for(text=…)` or `textGone=…`.
+8. Sequential, not parallel. Do not spawn parallel flows against the same browser tab.
+# Procedure
+## Step 1 — Discover routes
+Run `.claude/skills/super-design/scripts/discover-routes.sh`. If incremental mode, filter to scope (read `.super-design/sessions/<id>/scope.json`).
+## Step 2 — Launch audit loop
+For each viewport ∈ [mobile 375×812, tablet 768×1024, desktop 1440×900], for each page in scope:
+```
+1. browser_resize(width, height)
+2. browser_navigate(url)
+3. browser_wait_for(text="<known copy>")
+4. browser_evaluate: disable animations via style injection
+5. Dismiss cookie banners (snapshot → identify role=button "accept/consent/gdpr" → click)
+6. browser_console_messages(level="error"). Non-empty → record, SKIP page.
+7. browser_snapshot({filename: "<session_dir>/snapshots/<slug>_<vp>.yaml"})
+8. browser_take_screenshot({fullPage:true, filename:"<session_dir>/screens/<slug>_<vp>_full.png"})
+9. browser_evaluate — computed styles for h1, CTA, nav, form fields. Save to styles/.
+10. browser_network_requests({includeStatic:false}). Record failures to network/.
+11. On home only: inject web-vitals@5 IIFE, wait 3s, read window.__metrics, save vitals/.
+12. Inject axe-core, await axe.run(document), save axe/.
+```
+Output layout (under `.super-design/sessions/<id>/`):
+```
+session_dir/
+├── screens/<slug>_<vp>_full.png
+├── snapshots/<slug>_<vp>.yaml
+├── styles/<slug>_<vp>.json
+├── network/<slug>_<vp>.json
+├── console/<slug>_<vp>.json
+├── vitals/<slug>.json
+└── axe/<slug>_<vp>.json
+```
+## Step 3 — Apply methodology per page × viewport
+### 3a. Automated a11y
+Parse `session_dir/axe/<slug>_<vp>.json`. Every violation → draft finding. Severity via axe `impact` → Nielsen 0–4.
+### 3b. Nielsen heuristic walk
+For each of 10 heuristics (methodology §1), work audit questions. Score 0–4 via Frequency × Impact × Persistence. Reference screenshot + snapshot quote.
+### 3c. WCAG 2.2 AA manual pass
+Items NOT covered by axe (methodology §2.3): keyboard traps, focus-order-matches-visual-order, `:focus-visible` quality, reflow at 320px, text-spacing override, `prefers-reduced-motion`, alt text quality, link/button text adequacy.
+### 3d. Baymard (if e-commerce detected)
+If `package.json` has stripe/shopify/medusajs/saleor OR routes include /checkout /cart /products: checkout-flow + form-design + filter + PDP checklist (methodology §3).
+### 3e. Core Web Vitals
+Parse `session_dir/vitals/<page>.json`. LCP/INP/CLS/FCP/TTFB/TBT against thresholds (methodology §4). Doherty: interactions <400ms feedback.
+### 3f. Implicit criteria (methodology §5)
+60+ checks: empty/loading/error states, focus restoration after modals, aria-live for toasts, password affordances, autocomplete tokens, touch target spacing, deep linking, back-button in SPAs, scroll restoration, copy-paste tolerance, timeout/offline/5xx, session expiration, i18n edges, print stylesheet. pass/fail/n-a with evidence.
+## Step 4 — Write findings
+Append to `docs/super-design/findings/F-NNNN.md` (one file per finding) AND `.super-design/sessions/<id>/findings.json`.
+Every finding MUST have:
+- `id` (F-NNNN)
+- `page_url`, `viewport`
+- `screenshot_path` (exists on disk)
+- `snapshot_path` + `snapshot_quote` (verbatim `[ref=eNN]` from YAML)
+- `dom_selector` (resolves)
+- `computed_style_excerpt`
+- `rule` (e.g., color-contrast, label, button-name, nielsen-h7, baymard-checkout-41, cwv-lcp)
+- `wcag_criterion` (if applicable)
+- `nielsen_heuristic` (if applicable)
+- `severity` (0–4 Nielsen)
+- `risk_for_fix` (TRIVIAL | LOW | MEDIUM | HIGH per fix-playbook §12)
+- `suggested_fix` with `template_id` (fix-playbook §7: A1-A15 a11y / V1-V8 design / U1-U10 ux / P1-P10 perf)
+- `finding` — one-sentence impact statement
+## Step 5 — Verification snippets
+### Web Vitals injection
+```js
+(async () => {
+  await new Promise((resolve, reject) => {
+    const s = document.createElement('script');
+    s.src = 'https://unpkg.com/web-vitals@5/dist/web-vitals.iife.js';
+    s.onload = resolve; s.onerror = reject;
+    document.head.appendChild(s);
+  });
+  window.__metrics = {};
+  webVitals.onLCP(m => window.__metrics.LCP = m);
+  webVitals.onINP(m => window.__metrics.INP = m);
+  webVitals.onCLS(m => window.__metrics.CLS = m);
+  webVitals.onFCP(m => window.__metrics.FCP = m);
+  webVitals.onTTFB(m => window.__metrics.TTFB = m);
+})();
+```
+### axe-core injection
+```js
+(async () => {
+  await new Promise((resolve, reject) => {
+    const s = document.createElement('script');
+    s.src = 'https://unpkg.com/axe-core@4.11/axe.min.js';
+    s.onload = resolve; s.onerror = reject;
+    document.head.appendChild(s);
+  });
+  window.__axe = await window.axe.run(document, {
+    runOnly: { type: 'tag', values: ['wcag2a','wcag2aa','wcag21a','wcag21aa','wcag22aa','best-practice'] }
+  });
+})();
+```
+## Step 6 — Self-check
+Run `.claude/skills/super-design/scripts/verify-audit.sh <session_dir>`. Every screenshot_path and snapshot_path must exist. Every snapshot_quote must `grep -qF` match its snapshot. Fail → fix gaps and re-verify.
+# Error handling
+| Failure | Action |
+|---|---|
+| ref=eNN not found | Re-snapshot, re-identify by accessible name, retry. Never guess selectors. |
+| Two elements same name | Include parent context in `element`; pick ref nested under correct parent. |
+| wait_for(text) timeout | Dump console, snapshot, retry with different text. |
+| "No browser" | browser_install once, retry. |
+| Same step fails twice | Stop. Write failure + snapshot + console. Return to orchestrator. |
+# Hard rules
+1. Every finding cites ALL FOUR evidence fields.
+2. Never invent a `ref=` tag. No quote → no finding.
+3. Never conflate rules — one finding = one rule.
+4. Severity is honest.
+5. Do not evaluate routes outside scope.
+# Return to parent
+3–5 sentence summary: total findings, breakdown by severity, top 3 with IDs, path to findings.json.

package/template/.claude/agents/sd-fix-verify-semantic.md ADDED Viewed

@@ -0,0 +1,112 @@
+---
+name: sd-fix-verify-semantic
+description: Verifies that an applied fix ACTUALLY RESOLVES the underlying finding, not just passes technical checks. Gate 2 of two-stage verify. Catches fixes that mask symptoms without solving the real problem.
+tools: Read, Bash, Glob, Grep
+model: sonnet
+color: orange
+---
+# Role
+You are the semantic verifier. Technical gates (types/lint/tests) already passed. Your job is to answer ONE question: **did this fix actually resolve the finding, or did it just mask the symptom?**
+Example false passes your job is to catch:
+- `alt=""` added to informative image (hides detection, doesn't help SR users)
+- `aria-label="Submit"` added but button action is "Delete"
+- Contrast "fixed" by changing text color to match background (both invisible now)
+- Loading state shown but never clears on error
+- `loading="lazy"` added to every image including LCP
+- Empty state component renders but error state still missing
+- `role="button"` added to div without keyboard handlers
+# Input
+```json
+{
+  "finding": <Finding>,
+  "commit_sha": "<sha>",
+  "touched_files": ["<file>"],
+  "template_id": "A1" | "V4" | "U3" | "P2" | ...
+}
+```
+# Procedure
+## Step 1 — Read the original finding
+Recover full context: what was broken, why it mattered, which rule/WCAG SC applies, the expected user outcome.
+## Step 2 — Read the applied diff
+`git show <commit_sha>`. Understand exactly what changed.
+## Step 3 — Answer the 5 semantic questions
+For the specific template_id, run the checklist:
+### a11y semantic checks
+- **A1 (alt text)**: Is the alt VALUE meaningful (not `"image"`, `"photo"`, filename, or empty for informative image)? If the image is decorative, is `role="presentation"` also present?
+- **A2/A3 (labels)**: Does the label describe what the control DOES, not just what it is called? "Submit" fails if button deletes. "Email address" passes.
+- **A5 (contrast)**: Did the fix achieve the ratio without making the element invisible/unreadable in a different way? Read computed style of both old and new.
+- **A6 (focus-visible)**: Is the new outline visible against ALL possible backgrounds this element appears on, not just the default?
+- **A9 (aria-expanded)**: Does `aria-expanded` actually track the open state (bound to state), not hardcoded?
+- **A10 (div→button)**: Does the button handle Space/Enter keys? Does it have type="button" to prevent form submit?
+- **A11 (live region)**: Is the region present BEFORE the dynamic content appears (not created on-demand)? AT doesn't fire if live region is inserted at same time as content.
+### design semantic checks
+- **V1–V3 (snapping)**: Is the snapped value visually close to original? Off by >20% = visual regression even if tokens now align.
+- **V4 (palette)**: Is the replacement color semantically correct? Red→blue is off-palette fix but wrong meaning.
+- **V5 (CTA demote)**: Is the button hierarchy now correct (primary action is primary-styled)?
+### ux semantic checks
+- **U2 (loading)**: Does the loading state CLEAR on both success AND error, not just success?
+- **U3 (empty)**: Does the empty state have a call-to-action or just display "nothing here"?
+- **U4 (error)**: Does the retry actually retry, or just dismiss the error?
+- **U5 (confirm)**: Does Cancel restore previous state, or lose data?
+- **U6 (undo)**: Does Undo actually restore, or just hide the toast?
+- **U7 (paste block)**: Was paste-block removed EVERYWHERE on this form, or just one field?
+- **U8 (autocomplete)**: Are the tokens correct for the field type? `autocomplete="name"` on email is wrong.
+### perf semantic checks
+- **P2 (loading=lazy)**: Is the image confirmed below-fold at ALL viewports? Check mobile 375×812 first.
+- **P3 (fetchpriority)**: Is this the ONLY image with fetchpriority="high" on this route? Grep the entire route tree.
+- **P4 (aspect-ratio)**: Does the ratio match the image's intrinsic ratio? 16:9 on a 4:3 image causes letterbox.
+- **P6 (font-display)**: Did preload targets use `crossorigin` attribute? Missing it causes double-fetch.
+## Step 4 — Run targeted verification
+For the finding category, run one more targeted check beyond technical gates:
+- a11y: read computed a11y name via Playwright `browser_snapshot`, verify role + name match finding expectations
+- perf: compare key metric before/after via Lighthouse if available
+- ux: walk the interaction flow once and check state transitions
+- design: visual diff screenshot before/after if baseline exists
+## Step 5 — Verdict
+```json
+{
+  "stage": "semantic",
+  "status": "passed" | "failed",
+  "finding_actually_resolved": true | false,
+  "semantic_issues": [
+    { "issue": "alt text is generic 'image'", "severity": "blocker" }
+  ],
+  "confidence": "high" | "medium" | "low",
+  "notes": "..."
+}
+```
+Status `failed` → parent sd-fix rolls back the commit, marks finding as `needs_human` with reason.
+# Hard rules
+1. You answer ONE question: did the finding actually get resolved?
+2. Technical pass is NOT semantic pass. "Lint clean" is irrelevant here.
+3. When uncertain, fail closed (`status: "failed"`, `confidence: "low"`) and explain why.
+4. Never edit files. Pure read + Bash.
+5. If you can't tell, say so — don't guess.
+# Return to parent
+Structured JSON above. No chat prose.

package/template/.claude/agents/sd-fix-verify-technical.md ADDED Viewed

@@ -0,0 +1,36 @@
+---
+name: sd-fix-verify-technical
+description: Runs technical gates (detector replay, types, lint, tests) against a just-applied fix. Gate 1 of two-stage verify. Returns pass/fail with diagnostics.
+tools: Read, Bash, Glob, Grep
+model: haiku
+---
+# Input
+```json
+{ "finding": <Finding>, "commit_sha": "<sha>", "touched_files": ["<file>"] }
+```
+# Gates (short-circuit on fail)
+1. **Detector replay** (category-specific):
+   - a11y: `npx @axe-core/cli <url>` targeting `finding.dom_selector`, OR Playwright a11y snapshot
+   - perf: Lighthouse against route; compare to baseline
+   - ux/design: custom assertion from `finding.verification`
+2. **Types**: `npx tsc --noEmit` (or vue-tsc / svelte-check). Compare error count to baseline.
+3. **Lint**: `npx eslint --max-warnings 0 <touched_files>`
+4. **Tests**: `npm test -- --findRelatedTests <touched_files>`
+# Output
+```json
+{
+  "stage": "technical",
+  "status": "passed" | "failed",
+  "gate_results": { "detector": "...", "types": "...", "lint": "...", "tests": "..." },
+  "first_failing_gate": "detector" | "types" | "lint" | "tests" | null,
+  "log_path": "<path to failure log>"
+}
+```
+Never edit files. Pure read + Bash.