npm - start-vibing - Versions diffs - 4.4.9 → 4.4.11 - Mend

start-vibing 4.4.9 → 4.4.11

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (5) hide show

package/package.json +1 -1
package/template/.claude/agents/research-scout.md +3 -2
package/template/.claude/hooks/research-session-start.sh +1 -1
package/template/.claude/skills/research/SKILL.md +10 -11
package/template/.claude/skills/super-design/SKILL.md +60 -57

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
 	"name": "start-vibing",
-	"version": "4.4.9",
+	"version": "4.4.11",
 	"description": "Setup Claude Code with 9 plugins, 6 community skills, and 8 MCP servers. Parallel install, auto-accept, superpowers + ralph-loop. e2e-audit 0.2.0 refactor (skill-only, no agents): SessionStart hook + slash command make the skill keyword-invokable (\"e2e audit\", \"roda o e2e\", \"integration test\", \"test coverage gaps\"). Source-first discovery via detect-stack, discover-routes (Next app/pages/Remix/SvelteKit/Nuxt/Astro), discover-api-surface (HTTP handlers, tRPC procedures, GraphQL, server actions, middleware auth), inventory-existing-tests (preserve prior corpus + sha256 drift hash), and detect-uncovered (branch-diff vs origin/main finds changes not covered by existing specs). Report-then-ask between mapping and Playwright run; post-run-feedback report before writing findings. SHOT+TRACE+ASSERT+SOURCE evidence quad per non-meta finding; meta rules (coverage-gap-*, uncovered-*, test-drift, stack-detect, post-run-feedback) exempt. verify-audit.sh enforces schema + quad. Generic (no project leakage). super-design 0.7.0 carries over.",
 	"type": "module",
 	"bin": {

package/template/.claude/agents/research-scout.md CHANGED Viewed

@@ -1,6 +1,6 @@
 ---
 name: research-scout
-description: MUST BE USED at the start of every research run to produce scout-plan.json. Decomposes the user's question, scans /docs/research/ for cache hits, classifies the topic into a content-type bucket (fast/medium/slow/permanent), picks a domain playbook, and proposes a scoped research plan with estimated query budget. Stops before any web query so the orchestrator can run the report-then-ask gate.
+description: MUST BE USED at the start of every research run to produce scout-plan.json. Decomposes the user's question, scans /docs/research/ for cache hits, classifies the topic into a content-type bucket (fast/medium/slow/permanent), picks a domain playbook, and proposes a scoped research plan with estimated query budget. Returns scout-plan.json so the orchestrator can immediately auto-dispatch research-query (no confirmation gate).
 tools: Read, Write, Glob, Grep, Bash
 model: haiku
 color: cyan
@@ -113,7 +113,8 @@ Write to `$SESSION_DIR/scout-plan.json`:
 Return to the orchestrator a short text with: slug, decomposition count,
 estimated queries, cache strategy, and any blockers. The orchestrator
-will run the report-then-ask gate with the user.
+prints a one-line summary and immediately dispatches research-query (no
+confirmation gate — user can interrupt mid-run).
 # Hard rules

package/template/.claude/hooks/research-session-start.sh CHANGED Viewed

@@ -1,4 +1,4 @@
 #!/usr/bin/env bash
 cat <<'EOF'
-{"hookSpecificOutput":{"hookEventName":"SessionStart","additionalContext":"When the user mentions research, investigate, find info, search for, look up, evaluate, compare, audit literature, competitor analysis, market research, library evaluation, prior art, pesquisar, pesquisa, pesquise, investigar, buscar info, procurar info, comparar, avaliar biblioteca, análise de mercado, análise de concorrentes — or asks to evaluate/compare/understand any technology, framework, vendor, methodology, or domain — you MUST invoke the research skill. Do not improvise a search plan, do not start WebSearch blind, do not skip the cache check. Read .claude/skills/research/SKILL.md first, then follow its entry flow precisely. The skill uses source-first scout (cache check + scoped plan) BEFORE touching the web, and emits a report-then-ask gate after scouting. Every non-meta claim must carry the URL+QUOTE+ACCESSED-AT+VERIFY-METHOD evidence quad. Output goes to /docs/research/<topic>.md, never to .claude/skills/research-cache/. Triangulation follows Denzin's 4 types — republication chains count as one source, not three. Content-type freshness has 4 buckets (fast/medium/slow/permanent); do not apply fast-bucket aging to slow-bucket topics. Hand off to super-design for UX audits and e2e-audit for test audits — research does not replace either."}}
+{"hookSpecificOutput":{"hookEventName":"SessionStart","additionalContext":"When the user mentions research, investigate, find info, search for, look up, evaluate, compare, audit literature, competitor analysis, market research, library evaluation, prior art, pesquisar, pesquisa, pesquise, investigar, buscar info, procurar info, comparar, avaliar biblioteca, análise de mercado, análise de concorrentes — or asks to evaluate/compare/understand any technology, framework, vendor, methodology, or domain — you MUST invoke the research skill. Do not improvise a search plan, do not start WebSearch blind, do not skip the cache check. Read .claude/skills/research/SKILL.md first, then follow its entry flow precisely. The skill uses source-first scout (cache check + scoped plan) BEFORE touching the web, then auto-dispatches research-query immediately (no confirmation gate — user can interrupt mid-run). Every non-meta claim must carry the URL+QUOTE+ACCESSED-AT+VERIFY-METHOD evidence quad. Output goes to /docs/research/<topic>.md, never to .claude/skills/research-cache/. Triangulation follows Denzin's 4 types — republication chains count as one source, not three. Content-type freshness has 4 buckets (fast/medium/slow/permanent); do not apply fast-bucket aging to slow-bucket topics. Hand off to super-design for UX audits and e2e-audit for test audits — research does not replace either."}}
 EOF

package/template/.claude/skills/research/SKILL.md CHANGED Viewed

@@ -27,7 +27,8 @@ Four-phase pipeline with 4 specialist agents:
 1. **Scout** (research-scout, Haiku) — decomposes the user's question, checks
    `/docs/research/` for existing fresh findings (content-type-calibrated
    freshness — fast/medium/slow/permanent buckets), proposes a scoped research
-   plan + estimated query budget. Stops here for `report-then-ask` gate.
+   plan + estimated query budget. Hands directly to Query — NO confirmation
+   gate (auto-proceed). User can interrupt mid-run if needed.
 2. **Query** (research-query, Sonnet) — executes web/library searches in
    parallel, fetches pages via WebFetch + context7 (for library docs),
    extracts atomic claims with URL+QUOTE+ACCESSED-AT evidence, dumps to
@@ -84,20 +85,18 @@ Pass `cache-check.json` + the user question. Scout returns
 }
 ```
-### Step 3 — Report-then-ask (HARD STOP)
+### Step 3 — Auto-proceed (no confirmation gate)
-Present a ≤6-line summary to the user:
+Print a ≤4-line summary for visibility, then immediately dispatch Query.
+Do NOT wait for user confirmation. The user can `/cancel` or interrupt
+mid-run if scope is wrong.
 ```
-Topic: react-server-components-data-fetching
-Domain: software-engineering · Playbook: library-evaluation
-Plan: 3 sub-questions, ~12 queries, ~8 min
-Cache: delta-update of /docs/research/react-server-components-data-fetching.md (47d old)
-Proceed? (y / scope-down / cancel)
+Topic: <slug> · Plan: <N> sub-questions, ~<Q> queries, ~<M> min
+Cache: <strategy> · Proceeding to query...
 ```
-Do NOT proceed to Step 4 without an explicit reply. Mirror the e2e-audit
-report-then-ask gate.
+Exception: if `--dry-run` flag is set, stop here and return scout-plan.json.
 ### Step 4 — Query (Task tool → research-query)
@@ -253,7 +252,7 @@ fingerprints + ownership trees.
 ## Hard rules
 1. **Cache first**. Never burn query tokens on a fresh cache hit.
-2. **Report-then-ask**. After scout, STOP for user confirmation before query.
+2. **Auto-proceed after scout**. Print summary, immediately dispatch query. NO confirmation gate. User can interrupt mid-run.
 3. **Every claim has URL+QUOTE+ACCESSED-AT+VERIFY-METHOD**. Verify gate fails closed on violations.
 4. **No fabricated citations, ever**. If a quote cannot be greppped in the fetched page, the claim is dropped.
 5. **Triangulate by Denzin type, not raw count**. 3 republications of one wire story = 1 source.

package/template/.claude/skills/super-design/SKILL.md CHANGED Viewed

@@ -1,13 +1,13 @@
 ---
 name: super-design
 description: >
-  Performs end-to-end design audits on web projects. MUST BE USED when the user
-  mentions super-design, design audit, UX review, accessibility review, design
-  critique, competitor analysis, WCAG, Core Web Vitals, usability audit, or asks
-  to evaluate their website's design quality. Produces market analysis, live-site
-  UX audit (WCAG 2.2 AA, Nielsen heuristics, Baymard, CWV), and synthesized
-  overview. Re-audits only what changed since last run. On explicit user request,
-  applies surgical fixes with full rollback.
+    Performs end-to-end design audits on web projects. MUST BE USED when the user
+    mentions super-design, design audit, UX review, accessibility review, design
+    critique, competitor analysis, WCAG, Core Web Vitals, usability audit, or asks
+    to evaluate their website's design quality. Produces market analysis, live-site
+    UX audit (WCAG 2.2 AA, Nielsen heuristics, Baymard, CWV), and synthesized
+    overview. Re-audits only what changed since last run. On explicit user request,
+    applies surgical fixes with full rollback.
 version: 0.7.0
 ---
@@ -23,41 +23,41 @@ Four-phase pipeline with 6 specialist agents:
    produces market-analysis.md + component-comparison.md.
 2. **UI/UX audit** (sd-audit) — drives browser via Playwright MCP directly.
    Six layers:
-   - Route discovery + static snap (Nielsen + WCAG 2.2 AA + Baymard + CWV)
-   - **Step 1.5 source-first discovery** (0.7.0+) —
-     `discover-surfaces.sh` reads the repo FIRST and emits an authoritative
-     inventory of modals, forms, triggers, internal nav, and Next.js
-     layout/error/loading/not-found/parallel/intercepting routes BEFORE
-     Playwright runs. `extract-project-rules.sh` parses FORBIDDEN tables
-     from CLAUDE.md / AGENTS.md / .cursorrules into audit-applicable
-     rules. Runtime cross-checks surface these as `modal-coverage-gap`,
-     `form-coverage-gap`, and `project-forbidden-<slug>` findings.
-   - **Step 2.5 component/modal/flow discovery** (Phase A inventory, B modal
-     enumeration, C flow exercising, D state matrix, E form coverage) — this
-     is where modal contents, empty/loading/error states, and flow errors
-     get real evidence instead of "checklist hypothetical". Phase B now
-     cross-references `surfaces.json` and files a `modal-coverage-gap`
-     finding for anything declared in source but never opened.
-   - **Step 3g design-intelligence scoring** (17-category rubric → DIS 0–100)
-     catches implicit best practices checklists miss (cards-in-flex-col,
-     low density, weak CTA hierarchy, vibecode smell). Emits MANDATORY
-     `design-intelligence-craft-summary` finding per page × viewport so
-     overview.md has one holistic verdict row ("admin mobile is 38/100
-     WEAK — holistic redesign scope") per combination, not just discrete
-     per-category findings.
-   - **Step 3h mobile-native audit** (21-item Duolingo/Linear/Arc/Cash-App
-     checklist) — replaces "responsive-web-on-a-phone" thinking.
-   - **Step 3i project-rule enforcement** (0.7.0+) — consumes
-     `project-rules.json` and fires primary findings keyed to the
-     project's own FORBIDDEN wording (e.g. `project-forbidden-use-cards-on-mobile`)
-     when the rule is violated at runtime. Not a tag, not a severity
-     bump — the project owner's rule IS the rule source.
-   - C16 ≤ 4 → **DSC-choice** proposal: sd-synthesis runs
-     `scripts/score-typeui.mjs --from-audit <dir>` to derive a 7-axis site
-     fingerprint (density/contrast/geometry/color/typography/motion/audience)
-     and rank all installed typeui-* skills by fit 0-1. Top-3 are written to
-     `typeui-proposal.json` + shown in overview.md with a copy-paste CLI.
-   Produces findings.json + design-intelligence.json with SHOT+QUOTE+SEL+VAL.
+    - Route discovery + static snap (Nielsen + WCAG 2.2 AA + Baymard + CWV)
+    - **Step 1.5 source-first discovery** (0.7.0+) —
+      `discover-surfaces.sh` reads the repo FIRST and emits an authoritative
+      inventory of modals, forms, triggers, internal nav, and Next.js
+      layout/error/loading/not-found/parallel/intercepting routes BEFORE
+      Playwright runs. `extract-project-rules.sh` parses FORBIDDEN tables
+      from CLAUDE.md / AGENTS.md / .cursorrules into audit-applicable
+      rules. Runtime cross-checks surface these as `modal-coverage-gap`,
+      `form-coverage-gap`, and `project-forbidden-<slug>` findings.
+    - **Step 2.5 component/modal/flow discovery** (Phase A inventory, B modal
+      enumeration, C flow exercising, D state matrix, E form coverage) — this
+      is where modal contents, empty/loading/error states, and flow errors
+      get real evidence instead of "checklist hypothetical". Phase B now
+      cross-references `surfaces.json` and files a `modal-coverage-gap`
+      finding for anything declared in source but never opened.
+    - **Step 3g design-intelligence scoring** (17-category rubric → DIS 0–100)
+      catches implicit best practices checklists miss (cards-in-flex-col,
+      low density, weak CTA hierarchy, vibecode smell). Emits MANDATORY
+      `design-intelligence-craft-summary` finding per page × viewport so
+      overview.md has one holistic verdict row ("admin mobile is 38/100
+      WEAK — holistic redesign scope") per combination, not just discrete
+      per-category findings.
+    - **Step 3h mobile-native audit** (21-item Duolingo/Linear/Arc/Cash-App
+      checklist) — replaces "responsive-web-on-a-phone" thinking.
+    - **Step 3i project-rule enforcement** (0.7.0+) — consumes
+      `project-rules.json` and fires primary findings keyed to the
+      project's own FORBIDDEN wording (e.g. `project-forbidden-use-cards-on-mobile`)
+      when the rule is violated at runtime. Not a tag, not a severity
+      bump — the project owner's rule IS the rule source.
+    - C16 ≤ 4 → **DSC-choice** proposal: sd-synthesis runs
+      `scripts/score-typeui.mjs --from-audit <dir>` to derive a 7-axis site
+      fingerprint (density/contrast/geometry/color/typography/motion/audience)
+      and rank all installed typeui-\* skills by fit 0-1. Top-3 are written to
+      `typeui-proposal.json` + shown in overview.md with a copy-paste CLI.
+      Produces findings.json + design-intelligence.json with SHOT+QUOTE+SEL+VAL.
 3. **Synthesis** (sd-synthesis) — unifies research + audit + design-intelligence
    into overview.md (per-page DIS table + executive summary + typeui proposal
    when craft ≤ 4).
@@ -67,7 +67,7 @@ Four-phase pipeline with 6 specialist agents:
    A1-A15 a11y · V1-V8 design · U1-U10 ux · P1-P10 perf · **M1-M15 mobile**
    (cards-in-flex-col → compact list, table-on-mobile → card-per-row,
    centered-modal → bottom-sheet, etc.) · **DSC-1 design-skill advisory**
-   (proposes typeui-* direction). With `--typeui <name>` flag, sd-fix loads
+   (proposes typeui-\* direction). With `--typeui <name>` flag, sd-fix loads
    the chosen typeui skill (tokens: primary, radius, spacing scale,
    typography) and rebrands V1-V8 fixes so every spacing/color/radius target
    snaps to that direction's scale — turning advisory into applied. After each
@@ -91,6 +91,7 @@ fi
 ### Step 2: Scope decision (if incremental)
 Apply cascade from `references/change-detection-playbook.md` §4:
 - Run `scripts/detect-changes.sh $LAST_SHA` → classify changed files.
 - theory_doc_sha changed OR tokens changed OR major dep bump OR >180d old → FULL.
 - Only components changed → re-audit pages importing them (N=3 hops via madge).
@@ -109,6 +110,8 @@ Record scope in overview.md changelog banner.
 4. sd-fix         (ONLY if user asked; uses verify-technical + verify-semantic)
 ```
+**MCP serialization (HARD RULE):** Dispatch agents in this order **strictly sequentially** — wait for each agent to fully return before launching the next. NEVER launch two Playwright-MCP-using agents in the same message or while another is still running. Playwright MCP holds a single shared browser session; concurrent agents collide on the tab and the run bugs out (snapshots interleave, navigations race, sessions dangle). MCP-using agents that MUST be serialized: `sd-research`, `sd-audit`, `sd-fix`, `sd-fix-verify-semantic`. The only agent safe to run alongside (no MCP) is `sd-synthesis`, but per the order above it runs after the MCP agents anyway.
 Pass findings via files under `.super-design/sessions/<id>/`, not chat.
 ### Step 4: Write state + history
@@ -143,7 +146,7 @@ Do NOT paste overview into chat.
   block. Without this flag, DSC-1 stays advisory. Run
   `bash scripts/score-typeui.mjs --list` to see all installed directions.
 - `--harvest-typeui` — run `scripts/harvest-typeui.sh` first to pull
-  missing typeui-* and related design skills (typeui.sh registry with
+  missing typeui-\* and related design skills (typeui.sh registry with
   built-in catalog fallback). Idempotent; add `--refresh` to re-download.
 - `--dry-run` — artifacts without committing state
 - `--ci` — non-interactive, create PR, exit non-zero on blockers
@@ -161,13 +164,13 @@ is auto-detected; nothing else to configure.
 `scripts/detect-apps.sh` reads the first workspace manifest it finds:
-| Manifest | Source of globs |
-|----------|-----------------|
-| `pnpm-workspace.yaml` | `packages:` list |
-| `package.json` | `workspaces: [...]` or `workspaces.packages: [...]` (npm, yarn, Bun) |
-| `turbo.json` | Presence → uses pnpm/npm/yarn workspaces; falls back to `apps/*` + `packages/*` if none |
-| `nx.json` | `workspaceLayout.appsDir` / `libsDir` (default `apps/*`, `libs/*`) |
-| `bunfig.toml` | Presence → falls back to `apps/*` + `packages/*` if package.json has no workspaces |
+| Manifest              | Source of globs                                                                         |
+| --------------------- | --------------------------------------------------------------------------------------- |
+| `pnpm-workspace.yaml` | `packages:` list                                                                        |
+| `package.json`        | `workspaces: [...]` or `workspaces.packages: [...]` (npm, yarn, Bun)                    |
+| `turbo.json`          | Presence → uses pnpm/npm/yarn workspaces; falls back to `apps/*` + `packages/*` if none |
+| `nx.json`             | `workspaceLayout.appsDir` / `libsDir` (default `apps/*`, `libs/*`)                      |
+| `bunfig.toml`         | Presence → falls back to `apps/*` + `packages/*` if package.json has no workspaces      |
 Each matched directory that also has a `package.json` becomes an app
 with `name` taken from `package.json#name` (scope stripped), `path` the
@@ -231,14 +234,14 @@ Windows git-bash + Linux.
   (`{nodes, edges, hash, backend}`) and persists `import_graph_sha` to
   state. Prefers `npx madge --json <roots>`; falls back to a regex
   scanner (JS/TS only, no alias resolution) if madge is missing.
-  - Query: `bash .../build-import-graph.sh importers <file> --hops 3`
-    → BFS over reversed edges; `detect-changes.sh` uses this to close
-    the component→pages gap when only components changed (Step 2 scope
-    decision: "Only components changed → re-audit pages importing them
-    (N=3 hops via madge)").
+    - Query: `bash .../build-import-graph.sh importers <file> --hops 3`
+      → BFS over reversed edges; `detect-changes.sh` uses this to close
+      the component→pages gap when only components changed (Step 2 scope
+      decision: "Only components changed → re-audit pages importing them
+      (N=3 hops via madge)").
 - `hash-pages.sh` — captures 3 viewports per URL (mobile_375, tablet_768,
   desktop_1280), emits `{html_hash, dom_structure_hash, viewport_hashes:
-  {<vp>: {sha256, phash, png_path}}}` per page to
+{<vp>: {sha256, phash, png_path}}}` per page to
   `docs/super-design/.cache/hashes/hashes.json` and persists each PNG to
   `<cache>/screenshots/<url-enc>/<vp>.png`. Applies artifact §3.4 mask
   defaults plus `MASK_SELECTORS`; `phash` uses `sharp` when available