start-vibing 4.4.9 → 4.4.11

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "start-vibing",
3
- "version": "4.4.9",
3
+ "version": "4.4.11",
4
4
  "description": "Setup Claude Code with 9 plugins, 6 community skills, and 8 MCP servers. Parallel install, auto-accept, superpowers + ralph-loop. e2e-audit 0.2.0 refactor (skill-only, no agents): SessionStart hook + slash command make the skill keyword-invokable (\"e2e audit\", \"roda o e2e\", \"integration test\", \"test coverage gaps\"). Source-first discovery via detect-stack, discover-routes (Next app/pages/Remix/SvelteKit/Nuxt/Astro), discover-api-surface (HTTP handlers, tRPC procedures, GraphQL, server actions, middleware auth), inventory-existing-tests (preserve prior corpus + sha256 drift hash), and detect-uncovered (branch-diff vs origin/main finds changes not covered by existing specs). Report-then-ask between mapping and Playwright run; post-run-feedback report before writing findings. SHOT+TRACE+ASSERT+SOURCE evidence quad per non-meta finding; meta rules (coverage-gap-*, uncovered-*, test-drift, stack-detect, post-run-feedback) exempt. verify-audit.sh enforces schema + quad. Generic (no project leakage). super-design 0.7.0 carries over.",
5
5
  "type": "module",
6
6
  "bin": {
@@ -1,6 +1,6 @@
1
1
  ---
2
2
  name: research-scout
3
- description: MUST BE USED at the start of every research run to produce scout-plan.json. Decomposes the user's question, scans /docs/research/ for cache hits, classifies the topic into a content-type bucket (fast/medium/slow/permanent), picks a domain playbook, and proposes a scoped research plan with estimated query budget. Stops before any web query so the orchestrator can run the report-then-ask gate.
3
+ description: MUST BE USED at the start of every research run to produce scout-plan.json. Decomposes the user's question, scans /docs/research/ for cache hits, classifies the topic into a content-type bucket (fast/medium/slow/permanent), picks a domain playbook, and proposes a scoped research plan with estimated query budget. Returns scout-plan.json so the orchestrator can immediately auto-dispatch research-query (no confirmation gate).
4
4
  tools: Read, Write, Glob, Grep, Bash
5
5
  model: haiku
6
6
  color: cyan
@@ -113,7 +113,8 @@ Write to `$SESSION_DIR/scout-plan.json`:
113
113
 
114
114
  Return to the orchestrator a short text with: slug, decomposition count,
115
115
  estimated queries, cache strategy, and any blockers. The orchestrator
116
- will run the report-then-ask gate with the user.
116
+ prints a one-line summary and immediately dispatches research-query (no
117
+ confirmation gate — user can interrupt mid-run).
117
118
 
118
119
  # Hard rules
119
120
 
@@ -1,4 +1,4 @@
1
1
  #!/usr/bin/env bash
2
2
  cat <<'EOF'
3
- {"hookSpecificOutput":{"hookEventName":"SessionStart","additionalContext":"When the user mentions research, investigate, find info, search for, look up, evaluate, compare, audit literature, competitor analysis, market research, library evaluation, prior art, pesquisar, pesquisa, pesquise, investigar, buscar info, procurar info, comparar, avaliar biblioteca, análise de mercado, análise de concorrentes — or asks to evaluate/compare/understand any technology, framework, vendor, methodology, or domain — you MUST invoke the research skill. Do not improvise a search plan, do not start WebSearch blind, do not skip the cache check. Read .claude/skills/research/SKILL.md first, then follow its entry flow precisely. The skill uses source-first scout (cache check + scoped plan) BEFORE touching the web, and emits a report-then-ask gate after scouting. Every non-meta claim must carry the URL+QUOTE+ACCESSED-AT+VERIFY-METHOD evidence quad. Output goes to /docs/research/<topic>.md, never to .claude/skills/research-cache/. Triangulation follows Denzin's 4 types — republication chains count as one source, not three. Content-type freshness has 4 buckets (fast/medium/slow/permanent); do not apply fast-bucket aging to slow-bucket topics. Hand off to super-design for UX audits and e2e-audit for test audits — research does not replace either."}}
3
+ {"hookSpecificOutput":{"hookEventName":"SessionStart","additionalContext":"When the user mentions research, investigate, find info, search for, look up, evaluate, compare, audit literature, competitor analysis, market research, library evaluation, prior art, pesquisar, pesquisa, pesquise, investigar, buscar info, procurar info, comparar, avaliar biblioteca, análise de mercado, análise de concorrentes — or asks to evaluate/compare/understand any technology, framework, vendor, methodology, or domain — you MUST invoke the research skill. Do not improvise a search plan, do not start WebSearch blind, do not skip the cache check. Read .claude/skills/research/SKILL.md first, then follow its entry flow precisely. The skill uses source-first scout (cache check + scoped plan) BEFORE touching the web, then auto-dispatches research-query immediately (no confirmation gate user can interrupt mid-run). Every non-meta claim must carry the URL+QUOTE+ACCESSED-AT+VERIFY-METHOD evidence quad. Output goes to /docs/research/<topic>.md, never to .claude/skills/research-cache/. Triangulation follows Denzin's 4 types — republication chains count as one source, not three. Content-type freshness has 4 buckets (fast/medium/slow/permanent); do not apply fast-bucket aging to slow-bucket topics. Hand off to super-design for UX audits and e2e-audit for test audits — research does not replace either."}}
4
4
  EOF
@@ -27,7 +27,8 @@ Four-phase pipeline with 4 specialist agents:
27
27
  1. **Scout** (research-scout, Haiku) — decomposes the user's question, checks
28
28
  `/docs/research/` for existing fresh findings (content-type-calibrated
29
29
  freshness — fast/medium/slow/permanent buckets), proposes a scoped research
30
- plan + estimated query budget. Stops here for `report-then-ask` gate.
30
+ plan + estimated query budget. Hands directly to Query — NO confirmation
31
+ gate (auto-proceed). User can interrupt mid-run if needed.
31
32
  2. **Query** (research-query, Sonnet) — executes web/library searches in
32
33
  parallel, fetches pages via WebFetch + context7 (for library docs),
33
34
  extracts atomic claims with URL+QUOTE+ACCESSED-AT evidence, dumps to
@@ -84,20 +85,18 @@ Pass `cache-check.json` + the user question. Scout returns
84
85
  }
85
86
  ```
86
87
 
87
- ### Step 3 — Report-then-ask (HARD STOP)
88
+ ### Step 3 — Auto-proceed (no confirmation gate)
88
89
 
89
- Present a ≤6-line summary to the user:
90
+ Print a ≤4-line summary for visibility, then immediately dispatch Query.
91
+ Do NOT wait for user confirmation. The user can `/cancel` or interrupt
92
+ mid-run if scope is wrong.
90
93
 
91
94
  ```
92
- Topic: react-server-components-data-fetching
93
- Domain: software-engineering · Playbook: library-evaluation
94
- Plan: 3 sub-questions, ~12 queries, ~8 min
95
- Cache: delta-update of /docs/research/react-server-components-data-fetching.md (47d old)
96
- Proceed? (y / scope-down / cancel)
95
+ Topic: <slug> · Plan: <N> sub-questions, ~<Q> queries, ~<M> min
96
+ Cache: <strategy> · Proceeding to query...
97
97
  ```
98
98
 
99
- Do NOT proceed to Step 4 without an explicit reply. Mirror the e2e-audit
100
- report-then-ask gate.
99
+ Exception: if `--dry-run` flag is set, stop here and return scout-plan.json.
101
100
 
102
101
  ### Step 4 — Query (Task tool → research-query)
103
102
 
@@ -253,7 +252,7 @@ fingerprints + ownership trees.
253
252
  ## Hard rules
254
253
 
255
254
  1. **Cache first**. Never burn query tokens on a fresh cache hit.
256
- 2. **Report-then-ask**. After scout, STOP for user confirmation before query.
255
+ 2. **Auto-proceed after scout**. Print summary, immediately dispatch query. NO confirmation gate. User can interrupt mid-run.
257
256
  3. **Every claim has URL+QUOTE+ACCESSED-AT+VERIFY-METHOD**. Verify gate fails closed on violations.
258
257
  4. **No fabricated citations, ever**. If a quote cannot be greppped in the fetched page, the claim is dropped.
259
258
  5. **Triangulate by Denzin type, not raw count**. 3 republications of one wire story = 1 source.
@@ -1,13 +1,13 @@
1
1
  ---
2
2
  name: super-design
3
3
  description: >
4
- Performs end-to-end design audits on web projects. MUST BE USED when the user
5
- mentions super-design, design audit, UX review, accessibility review, design
6
- critique, competitor analysis, WCAG, Core Web Vitals, usability audit, or asks
7
- to evaluate their website's design quality. Produces market analysis, live-site
8
- UX audit (WCAG 2.2 AA, Nielsen heuristics, Baymard, CWV), and synthesized
9
- overview. Re-audits only what changed since last run. On explicit user request,
10
- applies surgical fixes with full rollback.
4
+ Performs end-to-end design audits on web projects. MUST BE USED when the user
5
+ mentions super-design, design audit, UX review, accessibility review, design
6
+ critique, competitor analysis, WCAG, Core Web Vitals, usability audit, or asks
7
+ to evaluate their website's design quality. Produces market analysis, live-site
8
+ UX audit (WCAG 2.2 AA, Nielsen heuristics, Baymard, CWV), and synthesized
9
+ overview. Re-audits only what changed since last run. On explicit user request,
10
+ applies surgical fixes with full rollback.
11
11
  version: 0.7.0
12
12
  ---
13
13
 
@@ -23,41 +23,41 @@ Four-phase pipeline with 6 specialist agents:
23
23
  produces market-analysis.md + component-comparison.md.
24
24
  2. **UI/UX audit** (sd-audit) — drives browser via Playwright MCP directly.
25
25
  Six layers:
26
- - Route discovery + static snap (Nielsen + WCAG 2.2 AA + Baymard + CWV)
27
- - **Step 1.5 source-first discovery** (0.7.0+) —
28
- `discover-surfaces.sh` reads the repo FIRST and emits an authoritative
29
- inventory of modals, forms, triggers, internal nav, and Next.js
30
- layout/error/loading/not-found/parallel/intercepting routes BEFORE
31
- Playwright runs. `extract-project-rules.sh` parses FORBIDDEN tables
32
- from CLAUDE.md / AGENTS.md / .cursorrules into audit-applicable
33
- rules. Runtime cross-checks surface these as `modal-coverage-gap`,
34
- `form-coverage-gap`, and `project-forbidden-<slug>` findings.
35
- - **Step 2.5 component/modal/flow discovery** (Phase A inventory, B modal
36
- enumeration, C flow exercising, D state matrix, E form coverage) — this
37
- is where modal contents, empty/loading/error states, and flow errors
38
- get real evidence instead of "checklist hypothetical". Phase B now
39
- cross-references `surfaces.json` and files a `modal-coverage-gap`
40
- finding for anything declared in source but never opened.
41
- - **Step 3g design-intelligence scoring** (17-category rubric → DIS 0–100)
42
- catches implicit best practices checklists miss (cards-in-flex-col,
43
- low density, weak CTA hierarchy, vibecode smell). Emits MANDATORY
44
- `design-intelligence-craft-summary` finding per page × viewport so
45
- overview.md has one holistic verdict row ("admin mobile is 38/100
46
- WEAK — holistic redesign scope") per combination, not just discrete
47
- per-category findings.
48
- - **Step 3h mobile-native audit** (21-item Duolingo/Linear/Arc/Cash-App
49
- checklist) — replaces "responsive-web-on-a-phone" thinking.
50
- - **Step 3i project-rule enforcement** (0.7.0+) — consumes
51
- `project-rules.json` and fires primary findings keyed to the
52
- project's own FORBIDDEN wording (e.g. `project-forbidden-use-cards-on-mobile`)
53
- when the rule is violated at runtime. Not a tag, not a severity
54
- bump — the project owner's rule IS the rule source.
55
- - C16 ≤ 4 → **DSC-choice** proposal: sd-synthesis runs
56
- `scripts/score-typeui.mjs --from-audit <dir>` to derive a 7-axis site
57
- fingerprint (density/contrast/geometry/color/typography/motion/audience)
58
- and rank all installed typeui-* skills by fit 0-1. Top-3 are written to
59
- `typeui-proposal.json` + shown in overview.md with a copy-paste CLI.
60
- Produces findings.json + design-intelligence.json with SHOT+QUOTE+SEL+VAL.
26
+ - Route discovery + static snap (Nielsen + WCAG 2.2 AA + Baymard + CWV)
27
+ - **Step 1.5 source-first discovery** (0.7.0+) —
28
+ `discover-surfaces.sh` reads the repo FIRST and emits an authoritative
29
+ inventory of modals, forms, triggers, internal nav, and Next.js
30
+ layout/error/loading/not-found/parallel/intercepting routes BEFORE
31
+ Playwright runs. `extract-project-rules.sh` parses FORBIDDEN tables
32
+ from CLAUDE.md / AGENTS.md / .cursorrules into audit-applicable
33
+ rules. Runtime cross-checks surface these as `modal-coverage-gap`,
34
+ `form-coverage-gap`, and `project-forbidden-<slug>` findings.
35
+ - **Step 2.5 component/modal/flow discovery** (Phase A inventory, B modal
36
+ enumeration, C flow exercising, D state matrix, E form coverage) — this
37
+ is where modal contents, empty/loading/error states, and flow errors
38
+ get real evidence instead of "checklist hypothetical". Phase B now
39
+ cross-references `surfaces.json` and files a `modal-coverage-gap`
40
+ finding for anything declared in source but never opened.
41
+ - **Step 3g design-intelligence scoring** (17-category rubric → DIS 0–100)
42
+ catches implicit best practices checklists miss (cards-in-flex-col,
43
+ low density, weak CTA hierarchy, vibecode smell). Emits MANDATORY
44
+ `design-intelligence-craft-summary` finding per page × viewport so
45
+ overview.md has one holistic verdict row ("admin mobile is 38/100
46
+ WEAK — holistic redesign scope") per combination, not just discrete
47
+ per-category findings.
48
+ - **Step 3h mobile-native audit** (21-item Duolingo/Linear/Arc/Cash-App
49
+ checklist) — replaces "responsive-web-on-a-phone" thinking.
50
+ - **Step 3i project-rule enforcement** (0.7.0+) — consumes
51
+ `project-rules.json` and fires primary findings keyed to the
52
+ project's own FORBIDDEN wording (e.g. `project-forbidden-use-cards-on-mobile`)
53
+ when the rule is violated at runtime. Not a tag, not a severity
54
+ bump — the project owner's rule IS the rule source.
55
+ - C16 ≤ 4 → **DSC-choice** proposal: sd-synthesis runs
56
+ `scripts/score-typeui.mjs --from-audit <dir>` to derive a 7-axis site
57
+ fingerprint (density/contrast/geometry/color/typography/motion/audience)
58
+ and rank all installed typeui-\* skills by fit 0-1. Top-3 are written to
59
+ `typeui-proposal.json` + shown in overview.md with a copy-paste CLI.
60
+ Produces findings.json + design-intelligence.json with SHOT+QUOTE+SEL+VAL.
61
61
  3. **Synthesis** (sd-synthesis) — unifies research + audit + design-intelligence
62
62
  into overview.md (per-page DIS table + executive summary + typeui proposal
63
63
  when craft ≤ 4).
@@ -67,7 +67,7 @@ Four-phase pipeline with 6 specialist agents:
67
67
  A1-A15 a11y · V1-V8 design · U1-U10 ux · P1-P10 perf · **M1-M15 mobile**
68
68
  (cards-in-flex-col → compact list, table-on-mobile → card-per-row,
69
69
  centered-modal → bottom-sheet, etc.) · **DSC-1 design-skill advisory**
70
- (proposes typeui-* direction). With `--typeui <name>` flag, sd-fix loads
70
+ (proposes typeui-\* direction). With `--typeui <name>` flag, sd-fix loads
71
71
  the chosen typeui skill (tokens: primary, radius, spacing scale,
72
72
  typography) and rebrands V1-V8 fixes so every spacing/color/radius target
73
73
  snaps to that direction's scale — turning advisory into applied. After each
@@ -91,6 +91,7 @@ fi
91
91
  ### Step 2: Scope decision (if incremental)
92
92
 
93
93
  Apply cascade from `references/change-detection-playbook.md` §4:
94
+
94
95
  - Run `scripts/detect-changes.sh $LAST_SHA` → classify changed files.
95
96
  - theory_doc_sha changed OR tokens changed OR major dep bump OR >180d old → FULL.
96
97
  - Only components changed → re-audit pages importing them (N=3 hops via madge).
@@ -109,6 +110,8 @@ Record scope in overview.md changelog banner.
109
110
  4. sd-fix (ONLY if user asked; uses verify-technical + verify-semantic)
110
111
  ```
111
112
 
113
+ **MCP serialization (HARD RULE):** Dispatch agents in this order **strictly sequentially** — wait for each agent to fully return before launching the next. NEVER launch two Playwright-MCP-using agents in the same message or while another is still running. Playwright MCP holds a single shared browser session; concurrent agents collide on the tab and the run bugs out (snapshots interleave, navigations race, sessions dangle). MCP-using agents that MUST be serialized: `sd-research`, `sd-audit`, `sd-fix`, `sd-fix-verify-semantic`. The only agent safe to run alongside (no MCP) is `sd-synthesis`, but per the order above it runs after the MCP agents anyway.
114
+
112
115
  Pass findings via files under `.super-design/sessions/<id>/`, not chat.
113
116
 
114
117
  ### Step 4: Write state + history
@@ -143,7 +146,7 @@ Do NOT paste overview into chat.
143
146
  block. Without this flag, DSC-1 stays advisory. Run
144
147
  `bash scripts/score-typeui.mjs --list` to see all installed directions.
145
148
  - `--harvest-typeui` — run `scripts/harvest-typeui.sh` first to pull
146
- missing typeui-* and related design skills (typeui.sh registry with
149
+ missing typeui-\* and related design skills (typeui.sh registry with
147
150
  built-in catalog fallback). Idempotent; add `--refresh` to re-download.
148
151
  - `--dry-run` — artifacts without committing state
149
152
  - `--ci` — non-interactive, create PR, exit non-zero on blockers
@@ -161,13 +164,13 @@ is auto-detected; nothing else to configure.
161
164
 
162
165
  `scripts/detect-apps.sh` reads the first workspace manifest it finds:
163
166
 
164
- | Manifest | Source of globs |
165
- |----------|-----------------|
166
- | `pnpm-workspace.yaml` | `packages:` list |
167
- | `package.json` | `workspaces: [...]` or `workspaces.packages: [...]` (npm, yarn, Bun) |
168
- | `turbo.json` | Presence → uses pnpm/npm/yarn workspaces; falls back to `apps/*` + `packages/*` if none |
169
- | `nx.json` | `workspaceLayout.appsDir` / `libsDir` (default `apps/*`, `libs/*`) |
170
- | `bunfig.toml` | Presence → falls back to `apps/*` + `packages/*` if package.json has no workspaces |
167
+ | Manifest | Source of globs |
168
+ | --------------------- | --------------------------------------------------------------------------------------- |
169
+ | `pnpm-workspace.yaml` | `packages:` list |
170
+ | `package.json` | `workspaces: [...]` or `workspaces.packages: [...]` (npm, yarn, Bun) |
171
+ | `turbo.json` | Presence → uses pnpm/npm/yarn workspaces; falls back to `apps/*` + `packages/*` if none |
172
+ | `nx.json` | `workspaceLayout.appsDir` / `libsDir` (default `apps/*`, `libs/*`) |
173
+ | `bunfig.toml` | Presence → falls back to `apps/*` + `packages/*` if package.json has no workspaces |
171
174
 
172
175
  Each matched directory that also has a `package.json` becomes an app
173
176
  with `name` taken from `package.json#name` (scope stripped), `path` the
@@ -231,14 +234,14 @@ Windows git-bash + Linux.
231
234
  (`{nodes, edges, hash, backend}`) and persists `import_graph_sha` to
232
235
  state. Prefers `npx madge --json <roots>`; falls back to a regex
233
236
  scanner (JS/TS only, no alias resolution) if madge is missing.
234
- - Query: `bash .../build-import-graph.sh importers <file> --hops 3`
235
- → BFS over reversed edges; `detect-changes.sh` uses this to close
236
- the component→pages gap when only components changed (Step 2 scope
237
- decision: "Only components changed → re-audit pages importing them
238
- (N=3 hops via madge)").
237
+ - Query: `bash .../build-import-graph.sh importers <file> --hops 3`
238
+ → BFS over reversed edges; `detect-changes.sh` uses this to close
239
+ the component→pages gap when only components changed (Step 2 scope
240
+ decision: "Only components changed → re-audit pages importing them
241
+ (N=3 hops via madge)").
239
242
  - `hash-pages.sh` — captures 3 viewports per URL (mobile_375, tablet_768,
240
243
  desktop_1280), emits `{html_hash, dom_structure_hash, viewport_hashes:
241
- {<vp>: {sha256, phash, png_path}}}` per page to
244
+ {<vp>: {sha256, phash, png_path}}}` per page to
242
245
  `docs/super-design/.cache/hashes/hashes.json` and persists each PNG to
243
246
  `<cache>/screenshots/<url-enc>/<vp>.png`. Applies artifact §3.4 mask
244
247
  defaults plus `MASK_SELECTORS`; `phash` uses `sharp` when available