start-vibing 4.4.2 → 4.4.4

Files changed (28)
  1. package/package.json +1 -1
  2. package/template/.claude/agents/research-query.md +128 -0
  3. package/template/.claude/agents/research-scout.md +124 -0
  4. package/template/.claude/agents/research-synthesize.md +139 -0
  5. package/template/.claude/agents/research-verify.md +84 -0
  6. package/template/.claude/commands/research.md +18 -0
  7. package/template/.claude/hooks/format-on-edit.sh +26 -0
  8. package/template/.claude/hooks/git-context-session-start.sh +22 -0
  9. package/template/.claude/hooks/quality-gate-stop.sh +46 -0
  10. package/template/.claude/hooks/research-session-start.sh +4 -0
  11. package/template/.claude/settings.json +29 -0
  12. package/template/.claude/skills/research/SKILL.md +285 -0
  13. package/template/.claude/skills/research/references/domain-playbooks.md +604 -0
  14. package/template/.claude/skills/research/references/ontology-patterns.md +376 -0
  15. package/template/.claude/skills/research/references/research-methodology.md +794 -0
  16. package/template/.claude/skills/research/references/source-directory.md +280 -0
  17. package/template/.claude/skills/research/scripts/__pycache__/extract-claims.cpython-313.pyc +0 -0
  18. package/template/.claude/skills/research/scripts/check-cache.sh +129 -0
  19. package/template/.claude/skills/research/scripts/dedup-research.sh +80 -0
  20. package/template/.claude/skills/research/scripts/extract-claims.py +83 -0
  21. package/template/.claude/skills/research/scripts/update-index.sh +106 -0
  22. package/template/.claude/skills/research/scripts/verify-citations.sh +107 -0
  23. package/template/.claude/skills/research/templates/adr.md.tpl +66 -0
  24. package/template/.claude/skills/research/templates/index.md.tpl +25 -0
  25. package/template/.claude/skills/research/templates/moc.md.tpl +39 -0
  26. package/template/.claude/skills/research/templates/research-state.schema.json +64 -0
  27. package/template/.claude/skills/research/templates/research.md.tpl +117 -0
  28. package/template/.claude/agents/research-web.md +0 -164
@@ -0,0 +1,285 @@
1
+ ---
2
+ name: research
3
+ version: 0.1.0
4
+ description: >
5
+ Performs Baymard-Institute-grade research on any topic the user asks about
6
+ (UX patterns, library evaluation, market analysis, academic literature, API
7
+ integration, architectural decisions). MUST BE USED when the user mentions
8
+ research, investigate, find info, search for, look up, pesquisar, pesquisa,
9
+ pesquise, investigar, or asks to evaluate / compare / understand any
10
+ technology, framework, vendor, methodology, or domain. Source-first
11
+ pipeline: scout → query → synthesize → verify. Output goes to
12
+ /docs/research/<topic>.md with URL+QUOTE+ACCESSED-AT+VERIFY-METHOD evidence
13
+ per claim. Re-uses cached research when fresh, calibrated by content-type.
14
+ ---
15
+
16
+ # research — evidence-backed knowledge production
17
+
18
+ > **Operating principle**: every claim in research output must be defensible
19
+ > to a skeptical stakeholder. URL resolves, quote is in source, source is
20
+ > independent. Fabricated citations are the worst possible failure mode —
21
+ > the verify agent fails closed on them.
22
+
23
+ ## What this skill does
24
+
25
+ Four-phase pipeline with 4 specialist agents:
26
+
27
+ 1. **Scout** (research-scout, Haiku) — decomposes the user's question, checks
28
+ `/docs/research/` for existing fresh findings (content-type-calibrated
29
+ freshness — fast/medium/slow/permanent buckets), proposes a scoped research
30
+ plan + estimated query budget. Stops here for `report-then-ask` gate.
31
+ 2. **Query** (research-query, Sonnet) — executes web/library searches in
32
+ parallel, fetches pages via WebFetch + context7 (for library docs),
33
+ extracts atomic claims with URL+QUOTE+ACCESSED-AT evidence, dumps to
34
+ `claims.jsonl`.
35
+ 3. **Synthesize** (research-synthesize, Sonnet) — builds a lightweight
36
+ SKOS-adapted ontology, triangulates each claim across ≥3 independent
37
+ sources (Denzin's 4 types, not just count), produces final
38
+ `/docs/research/<topic>.md` and updates `index.md`.
39
+ 4. **Verify** (research-verify, Haiku) — anti-hallucination gate. For every
40
+ citation in the final doc: resolves URL, greps the literal quote, checks
41
+ DOI via Crossref. Fails closed on any unverified citation. Writes
42
+ `verify.json` with per-citation status.
43
+
44
+ ## Entry flow
45
+
46
+ ### Step 1 — Preflight + cache check
47
+
48
+ ```bash
49
+ TOPIC_SLUG=$(echo "$USER_QUESTION" | bash .claude/skills/research/scripts/check-cache.sh --slugify)
50
+ SESSION_DIR="docs/research/.cache/sessions/$(date +%Y-%m-%d-%H%M%S)-$TOPIC_SLUG"
51
+ mkdir -p "$SESSION_DIR"
52
+
53
+ bash .claude/skills/research/scripts/check-cache.sh \
54
+ --topic "$TOPIC_SLUG" \
55
+ --question "$USER_QUESTION" \
56
+ > "$SESSION_DIR/cache-check.json"
57
+ ```
58
+
59
+ `cache-check.json` reports: existing doc path (if any), age in days,
60
+ content-type bucket (fast/medium/slow/permanent), freshness verdict
61
+ (fresh / aging / stale / outdated), recommended action
62
+ (reuse | delta-update | full-research).
63
+
64
+ If the recommended action is `reuse`, return the existing doc path to the user and exit. Do
65
+ not burn query tokens on a cache hit.
66
+
67
+ ### Step 2 — Scout (Task tool → research-scout)
68
+
69
+ Pass `cache-check.json` + the user question. Scout returns
70
+ `scout-plan.json`:
71
+
72
+ ```jsonc
73
+ {
74
+ "topic_slug": "react-server-components-data-fetching",
75
+ "question": "...",
76
+ "decomposition": ["sub-q1", "sub-q2", "..."],
77
+ "domain": "software-engineering",
78
+ "playbook": "library-evaluation", // from references/domain-playbooks.md
79
+ "content_type_bucket": "fast", // fast | medium | slow | permanent
80
+ "estimated_queries": 12,
81
+ "estimated_minutes": 8,
82
+ "cache_strategy": "delta-update", // from cache-check.json
83
+ "blockers": [],
84
+ }
85
+ ```
86
+
87
+ ### Step 3 — Report-then-ask (HARD STOP)
88
+
89
+ Present a ≤6-line summary to the user:
90
+
91
+ ```
92
+ Topic: react-server-components-data-fetching
93
+ Domain: software-engineering · Playbook: library-evaluation
94
+ Plan: 3 sub-questions, ~12 queries, ~8 min
95
+ Cache: delta-update of /docs/research/react-server-components-data-fetching.md (47d old)
96
+ Proceed? (y / scope-down / cancel)
97
+ ```
98
+
99
+ Do NOT proceed to Step 4 without an explicit reply. Mirror the e2e-audit
100
+ report-then-ask gate.
101
+
102
+ ### Step 4 — Query (Task tool → research-query)
103
+
104
+ Dispatch with `scout-plan.json`. Agent runs parallel searches, writes
105
+ `claims.jsonl` (one atomic claim per line) and `sources.jsonl` (one source
106
+ per line, with accessed-at timestamp). Each claim references its source by
107
+ ID. Hard rule: every claim has at least one verbatim quote from its source.
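The Step 4 contract (each claim points at a source by ID and carries a verbatim quote) could be checked with something like the sketch below. This is not the package's `extract-claims.py`; the field names `id`, `source_id`, and `quote` are assumptions made for illustration, since the diff does not show the actual JSONL schema.

```python
import json


def check_claims(claims_jsonl: str, sources_jsonl: str) -> list[str]:
    """Return a list of problems; an empty list means every claim is well-formed."""
    # Index sources by their ID (hypothetical "id" field).
    sources = {}
    for line in sources_jsonl.splitlines():
        if line.strip():
            record = json.loads(line)
            sources[record["id"]] = record

    problems = []
    for number, line in enumerate(claims_jsonl.splitlines(), start=1):
        if not line.strip():
            continue
        claim = json.loads(line)
        # Each claim must reference a known source (hypothetical "source_id" field).
        if claim.get("source_id") not in sources:
            problems.append(f"claim {number}: unknown source_id {claim.get('source_id')!r}")
        # Hard rule from the text: at least one verbatim quote per claim.
        if not (claim.get("quote") or "").strip():
            problems.append(f"claim {number}: missing verbatim quote")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every malformed claim in a single pass.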
108
+
109
+ ### Step 5 — Synthesize (Task tool → research-synthesize)
110
+
111
+ Dispatch with `claims.jsonl` + `sources.jsonl`. Agent:
112
+
113
+ 1. Builds ontology from `references/ontology-patterns.md` vocabulary.
114
+ 2. Triangulates: groups claims by assertion, requires ≥3 INDEPENDENT
115
+ sources (Denzin types, not just count) for high-confidence claims.
116
+ 3. Renders `/docs/research/<topic-slug>.md` from
117
+ `templates/research.md.tpl`.
118
+ 4. Writes ADR to `/docs/research/decisions/NNNN-<slug>.md` if the
119
+ user's question implies a decision.
120
+ 5. Calls `scripts/update-index.sh` to regenerate `/docs/research/index.md`
121
+ and any MOCs.
122
+
123
+ ### Step 6 — Verify (Task tool → research-verify)
124
+
125
+ Dispatch with the rendered doc. Agent runs
126
+ `scripts/verify-citations.sh <doc>` which:
127
+
128
+ - For each `Source` row → fetches URL, checks HTTP 200, greps the
129
+ associated quote.
130
+ - For DOIs → hits Crossref API.
131
+ - Writes `verify.json` to the session dir.
132
+ - Returns non-zero on any failed citation.
133
+
134
+ If verify fails, the synthesize agent is re-dispatched with the failure
135
+ report to fix or remove unverifiable claims. Three failed verify rounds →
136
+ abort and surface findings to the user.
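The retry policy described above (re-dispatch synthesis after a failed verify, abort after three failed rounds) can be sketched as a small loop. The two callables are hypothetical stand-ins for dispatching the verify and synthesize agents; how the skill actually dispatches them is not shown in this diff.

```python
from typing import Callable


def verify_with_retries(
    verify: Callable[[], list[str]],
    resynthesize: Callable[[list[str]], None],
    max_rounds: int = 3,
) -> tuple[bool, list[str]]:
    """Run verify up to max_rounds times, re-dispatching synthesis after each failure.

    verify() returns the list of failed citations (empty means success).
    Returns (passed, remaining_failures).
    """
    failures: list[str] = []
    for round_number in range(1, max_rounds + 1):
        failures = verify()
        if not failures:
            return True, []
        # Only re-synthesize if another verify round will follow.
        if round_number < max_rounds:
            resynthesize(failures)
    # Third failed round: abort and hand the failures back to the caller.
    return False, failures
```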
137
+
138
+ ### Step 7 — Persist + summarize
139
+
140
+ ```bash
141
+ bash .claude/skills/research/scripts/update-index.sh
142
+ echo "$TOPIC_SLUG $(date -u +%Y-%m-%dT%H:%M:%SZ) $VERIFY_STATUS" \
143
+ >> docs/research/.research-state.jsonl
144
+ ```
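Note that the shell snippet above appends a whitespace-separated line, while the `.jsonl` extension and the `research-state.schema.json` template suggest one JSON object per line. A minimal sketch of a JSON-emitting variant follows; the key names are assumptions, because the schema's fields are not visible in this part of the diff.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def state_entry(topic_slug: str, verify_status: str, now: Optional[datetime] = None) -> str:
    """Build one JSON line for the audit log (key names are hypothetical)."""
    # `now` is expected to be a UTC datetime; defaults to the current UTC time.
    now = now or datetime.now(timezone.utc)
    return json.dumps({
        "topic_slug": topic_slug,
        "timestamp": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "verify_status": verify_status,
    })


def append_state(path: str, topic_slug: str, verify_status: str) -> None:
    """Append a single entry; the log is append-only, so never rewrite the file."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(state_entry(topic_slug, verify_status) + "\n")
```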
145
+
146
+ Return ≤5 sentences to the user: doc path, claim count, sources cited,
147
+ confidence levels, open questions count. Do NOT paste the doc body.
148
+
149
+ ## User flags
150
+
151
+ - `--force-fresh` — ignore cache, full research even if fresh exists
152
+ - `--delta-only` — only update sections that changed since the cached version
153
+ - `--scope <bucket>` — narrow content-type bucket (fast | medium | slow | permanent)
154
+ - `--playbook <name>` — override playbook detection (from `references/domain-playbooks.md`)
155
+ - `--no-verify` — skip verify gate (NOT recommended; only for offline runs)
156
+ - `--lang <code>` — output language (default: `en`; accepts `pt`, `es`, etc.)
157
+ - `--max-queries <N>` — cap total queries (default 20)
158
+ - `--dry-run` — produce scout-plan.json then stop
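For readers who want the flag semantics in one place, the list above maps onto an argument parser roughly as follows. This is only an illustration: the skill is driven by prompts rather than a real CLI, and `build_parser` is a hypothetical name. The flag names, the `--scope` choices, and the defaults (`en`, 20) come from the list.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented user flags."""
    parser = argparse.ArgumentParser(prog="research")
    parser.add_argument("--force-fresh", action="store_true",
                        help="ignore cache and run full research")
    parser.add_argument("--delta-only", action="store_true",
                        help="only update sections that changed since the cached version")
    parser.add_argument("--scope", choices=["fast", "medium", "slow", "permanent"],
                        help="narrow the content-type bucket")
    parser.add_argument("--playbook", help="override playbook detection")
    parser.add_argument("--no-verify", action="store_true",
                        help="skip the verify gate (offline runs only)")
    parser.add_argument("--lang", default="en", help="output language code")
    parser.add_argument("--max-queries", type=int, default=20,
                        help="cap on total queries")
    parser.add_argument("--dry-run", action="store_true",
                        help="produce scout-plan.json, then stop")
    return parser
```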
159
+
160
+ ## Output layout
161
+
162
+ ```
163
+ docs/research/
164
+ ├── index.md # auto-regenerated by update-index.sh
165
+ ├── .research-state.jsonl # append-only audit log
166
+ ├── <topic-slug>.md # one per topic — main deliverable
167
+ ├── decisions/
168
+ │ ├── index.md
169
+ │ └── NNNN-<slug>.md # ADR (Nygard 2011 format)
170
+ ├── moc/ # Maps of Content (Nick Milo)
171
+ │ └── <theme>.md
172
+ └── .cache/
173
+ └── sessions/<id>/
174
+ ├── cache-check.json
175
+ ├── scout-plan.json
176
+ ├── claims.jsonl
177
+ ├── sources.jsonl
178
+ ├── verify.json
179
+ └── snapshots/<n>.html # WebFetched page caches for grep
180
+ ```
181
+
182
+ ## Evidence protocol — URL+QUOTE+ACCESSED-AT+VERIFY-METHOD
183
+
184
+ Adapted from super-design's SHOT+QUOTE+SEL+VAL. Every non-meta claim in the
185
+ output ships:
186
+
187
+ | Field | Meaning |
188
+ | ----------------- | ---------------------------------------------------------------- |
189
+ | **URL** | Resolvable HTTP 200 URL OR DOI resolved through Crossref |
190
+ | **QUOTE** | Verbatim string greppable in the fetched source page |
191
+ | **ACCESSED-AT** | UTC ISO-8601 timestamp of the fetch |
192
+ | **VERIFY-METHOD** | `web-fetch` / `crossref-api` / `screenshot` / `archive-snapshot` |
193
+
194
+ `scripts/verify-citations.sh` enforces this contract. Coverage-gap and
195
+ "open question" findings are exempt (no claim to verify).
196
+
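The four-field evidence contract can be modelled as a record plus an offline check, as in the sketch below. It is not the logic of `verify-citations.sh`, which is not shown here. The class and function names are hypothetical, and collapsing whitespace before matching the quote is an assumption (the text only requires a verbatim, greppable string). URL resolution and Crossref lookups are deliberately left out so the example runs without network access.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta

# Allowed values taken from the VERIFY-METHOD row of the table above.
VERIFY_METHODS = {"web-fetch", "crossref-api", "screenshot", "archive-snapshot"}


@dataclass(frozen=True)
class Evidence:
    url: str
    quote: str
    accessed_at: str
    verify_method: str


def _normalize(text: str) -> str:
    """Collapse runs of whitespace so line wrapping does not break the match."""
    return re.sub(r"\s+", " ", text).strip()


def evidence_problems(evidence: Evidence, page_text: str) -> list[str]:
    """Return contract violations for one evidence record; empty means it passes."""
    problems = []
    if not evidence.url.strip():
        problems.append("URL is empty")
    if not evidence.quote.strip() or _normalize(evidence.quote) not in _normalize(page_text):
        problems.append("QUOTE not found in fetched page")
    try:
        # Accept a trailing "Z" on Python versions where fromisoformat rejects it.
        parsed = datetime.fromisoformat(evidence.accessed_at.replace("Z", "+00:00"))
    except ValueError:
        problems.append("ACCESSED-AT is not ISO-8601")
    else:
        if parsed.utcoffset() != timedelta(0):
            problems.append("ACCESSED-AT is not UTC")
    if evidence.verify_method not in VERIFY_METHODS:
        problems.append(f"unknown VERIFY-METHOD {evidence.verify_method!r}")
    return problems
```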
197
+ ## Freshness — content-type buckets (NOT one-size)
198
+
199
+ | Bucket | Examples | Fresh | Aging | Stale | Outdated |
200
+ | ------------- | --------------------------------------------------------- | ----- | ------- | -------- | -------- |
201
+ | **fast** | frontend frameworks, AI/LLM SOTA, cloud pricing | <30d | 30–90d | 90–180d | >180d |
202
+ | **medium** | established libraries, design patterns, UX best practices | <90d | 90–180d | 180–365d | >365d |
203
+ | **slow** | language fundamentals, CS theory, HCI research | <365d | 1–2y | 2–5y | >5y |
204
+ | **permanent** | math theorems, physical laws, historical facts | <5y | 5–10y | 10–20y | >20y |
205
+
206
+ Bucket detection lives in `scripts/check-cache.sh`. Override with `--scope`.
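The table above translates into a small lookup, sketched here. This is not the code in `check-cache.sh`. Two assumptions are made: a year is treated as 365 days, and where the table's ranges share an endpoint (for example, 90 days in the fast bucket), the age falls into the later category, except that the upper stale bound stays stale because outdated is written as strictly greater.

```python
# (fresh below, aging below, stale through) in days, per bucket.
THRESHOLDS_DAYS = {
    "fast": (30, 90, 180),
    "medium": (90, 180, 365),
    "slow": (365, 2 * 365, 5 * 365),
    "permanent": (5 * 365, 10 * 365, 20 * 365),
}


def freshness_verdict(bucket: str, age_days: int) -> str:
    """Classify a cached document's age for the given content-type bucket."""
    fresh_below, aging_below, stale_through = THRESHOLDS_DAYS[bucket]
    if age_days < fresh_below:
        return "fresh"
    if age_days < aging_below:
        return "aging"
    if age_days <= stale_through:
        return "stale"
    return "outdated"
```

Under these assumptions, the 47-day-old document from the Step 3 example would be classified as `aging` in the fast bucket.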
207
+
208
+ ## Triangulation — Denzin's 4 types, not "3 sources"
209
+
210
+ Per `references/research-methodology.md` §4. A claim achieves
211
+ **high-confidence** only when it survives ≥3 INDEPENDENT sources where
212
+ "independent" means satisfying ≥1 of:
213
+
214
+ - **Data triangulation** — different time/place/persons
215
+ - **Investigator triangulation** — different authors with no shared
216
+ funding/employer
217
+ - **Theoretical triangulation** — different theoretical framings reach
218
+ the same conclusion
219
+ - **Methodological triangulation** — different methods (survey vs
220
+ interview vs telemetry) converge
221
+
222
+ Republication chains and citation cascades count as **one** source.
223
+ `scripts/verify-citations.sh` flags suspected republication via shared DOM
224
+ fingerprints + ownership trees.
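The counting rule (a republication chain is one source, and high confidence needs at least three independent sources) can be illustrated as below. The sketch covers only the republication-collapse step. It does not judge independence by Denzin type and does not detect republication. The `origin_id` field, which marks sources that trace back to the same original, is a hypothetical name.

```python
def independent_source_count(sources: list[dict]) -> int:
    """Count sources after collapsing each republication chain into one."""
    # Sources sharing an origin_id count once; a source without one stands alone.
    origins = {source.get("origin_id") or source["id"] for source in sources}
    return len(origins)


def is_high_confidence(sources: list[dict], minimum: int = 3) -> bool:
    """True when at least `minimum` independent sources back the claim."""
    return independent_source_count(sources) >= minimum
```

Three republications of one wire story therefore count as a single source and do not reach the threshold on their own.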
225
+
226
+ ## Scripts (`.claude/skills/research/scripts/`)
227
+
228
+ | Script | Purpose |
229
+ | --------------------- | ---------------------------------------------------------------------------------------------------- |
230
+ | `check-cache.sh` | Slugify topic, scan `/docs/research/`, classify content-type bucket, return reuse/delta/full verdict |
231
+ | `verify-citations.sh` | Per citation: HTTP 200, quote grep, DOI Crossref check, write verify.json |
232
+ | `dedup-research.sh` | Detect overlap between docs (jaccard on concept lists + citation overlap), suggest merge |
233
+ | `update-index.sh` | Regenerate `/docs/research/index.md` + per-folder indexes from frontmatter |
234
+ | `extract-claims.py` | Pull atomic claims with citations from a rendered doc into JSONL |
235
+
236
+ ## Templates (`.claude/skills/research/templates/`)
237
+
238
+ | Template | Output |
239
+ | ---------------------------- | ------------------------------------------- |
240
+ | `research.md.tpl` | Main `/docs/research/<slug>.md` deliverable |
241
+ | `adr.md.tpl` | Nygard ADR for decision questions |
242
+ | `moc.md.tpl` | Map of Content for cross-topic themes |
243
+ | `index.md.tpl` | TOC for `/docs/research/index.md` |
244
+ | `research-state.schema.json` | Schema for state JSONL entries |
245
+
246
+ ## References (read on demand)
247
+
248
+ - `references/research-methodology.md` — the 15-topic deep methodology bible
249
+ - `references/ontology-patterns.md` — SKOS-adapted relationship vocabulary + LLM-friendly ontology examples
250
+ - `references/source-directory.md` — per-domain authoritative sources, authority hierarchies, AI-content red flags
251
+ - `references/domain-playbooks.md` — step-by-step protocols per research domain (UX, library eval, API, ADR, market, academic, news, security, pricing)
252
+
253
+ ## Hard rules
254
+
255
+ 1. **Cache first**. Never burn query tokens on a fresh cache hit.
256
+ 2. **Report-then-ask**. After scout, STOP for user confirmation before query.
257
+ 3. **Every claim has URL+QUOTE+ACCESSED-AT+VERIFY-METHOD**. Verify gate fails closed on violations.
258
+ 4. **No fabricated citations, ever**. If a quote cannot be grepped in the fetched page, the claim is dropped.
259
+ 5. **Triangulate by Denzin type, not raw count**. 3 republications of one wire story = 1 source.
260
+ 6. **Content-type freshness**. Don't apply fast-bucket aging to slow-bucket topics or vice versa.
261
+ 7. **Output to `/docs/research/`** — never to `.claude/skills/research-cache/` (legacy).
262
+ 8. **English output by default** — even when triggered in Portuguese. Override with `--lang pt`.
263
+ 9. **Summary to user ≤5 sentences**. Doc body lives in the file.
264
+ 10. **Skill ⊥ super-design ⊥ e2e-audit**. If the user asked for a UX audit or test audit, hand off — do not improvise.
265
+
266
+ ## Boundaries (what this skill does NOT do)
267
+
268
+ - Does NOT implement code based on its own findings. Hand the doc to the user / implementing agent.
269
+ - Does NOT replace `super-design` for design audits or `e2e-audit` for test audits.
270
+ - Does NOT publish research outside `/docs/research/`.
271
+ - Does NOT invent fixtures or scrape behind paywalls.
272
+ - Does NOT bypass robots.txt or other access restrictions on cited sites.
273
+
274
+ ## Invocation triggers (enforced by SessionStart hook)
275
+
276
+ EN: `research`, `investigate`, `find info`, `search for`, `look up`,
277
+ `evaluate`, `compare`, `audit literature`, `competitor analysis`,
278
+ `market research`, `library evaluation`, `prior art`.
279
+
280
+ PT: `pesquisar`, `pesquisa`, `pesquise`, `investigar`, `buscar info`,
281
+ `procurar info`, `comparar`, `avaliar biblioteca`, `análise de
282
+ mercado`, `análise de concorrentes`.
283
+
284
+ The hook injects this context at session start. Claude must read this
285
+ SKILL.md before improvising a research plan.