start-vibing 4.4.3 → 4.4.4
- package/package.json +1 -1
- package/template/.claude/agents/research-query.md +128 -0
- package/template/.claude/agents/research-scout.md +124 -0
- package/template/.claude/agents/research-synthesize.md +139 -0
- package/template/.claude/agents/research-verify.md +84 -0
- package/template/.claude/commands/research.md +18 -0
- package/template/.claude/hooks/research-session-start.sh +4 -0
- package/template/.claude/settings.json +4 -0
- package/template/.claude/skills/research/SKILL.md +285 -0
- package/template/.claude/skills/research/references/domain-playbooks.md +604 -0
- package/template/.claude/skills/research/references/ontology-patterns.md +376 -0
- package/template/.claude/skills/research/references/research-methodology.md +794 -0
- package/template/.claude/skills/research/references/source-directory.md +280 -0
- package/template/.claude/skills/research/scripts/__pycache__/extract-claims.cpython-313.pyc +0 -0
- package/template/.claude/skills/research/scripts/check-cache.sh +129 -0
- package/template/.claude/skills/research/scripts/dedup-research.sh +80 -0
- package/template/.claude/skills/research/scripts/extract-claims.py +83 -0
- package/template/.claude/skills/research/scripts/update-index.sh +106 -0
- package/template/.claude/skills/research/scripts/verify-citations.sh +107 -0
- package/template/.claude/skills/research/templates/adr.md.tpl +66 -0
- package/template/.claude/skills/research/templates/index.md.tpl +25 -0
- package/template/.claude/skills/research/templates/moc.md.tpl +39 -0
- package/template/.claude/skills/research/templates/research-state.schema.json +64 -0
- package/template/.claude/skills/research/templates/research.md.tpl +117 -0
- package/template/.claude/agents/research-web.md +0 -164
@@ -0,0 +1,285 @@
---
name: research
version: 0.1.0
description: >
  Performs Baymard-Institute-grade research on any topic the user asks about
  (UX patterns, library evaluation, market analysis, academic literature, API
  integration, architectural decisions). MUST BE USED when the user mentions
  research, investigate, find info, search for, look up, pesquisar, pesquisa,
  pesquise, investigar, or asks to evaluate / compare / understand any
  technology, framework, vendor, methodology, or domain. Source-first
  pipeline: scout → query → synthesize → verify. Output goes to
  /docs/research/<topic>.md with URL+QUOTE+ACCESSED-AT+VERIFY-METHOD evidence
  per claim. Re-uses cached research when fresh, calibrated by content-type.
---

# research — evidence-backed knowledge production

> **Operating principle**: every claim in research output must be defensible
> to a skeptical stakeholder. URL resolves, quote is in source, source is
> independent. Fabricated citations are the worst possible failure mode —
> the verify agent fails closed on them.

## What this skill does

Four-phase pipeline with 4 specialist agents:

1. **Scout** (research-scout, Haiku) — decomposes the user's question, checks
   `/docs/research/` for existing fresh findings (content-type-calibrated
   freshness — fast/medium/slow/permanent buckets), proposes a scoped research
   plan + estimated query budget. Stops here for `report-then-ask` gate.
2. **Query** (research-query, Sonnet) — executes web/library searches in
   parallel, fetches pages via WebFetch + context7 (for library docs),
   extracts atomic claims with URL+QUOTE+ACCESSED-AT evidence, dumps to
   `claims.jsonl`.
3. **Synthesize** (research-synthesize, Sonnet) — builds a lightweight
   SKOS-adapted ontology, triangulates each claim across ≥3 independent
   sources (Denzin's 4 types, not just count), produces final
   `/docs/research/<topic>.md` and updates `index.md`.
4. **Verify** (research-verify, Haiku) — anti-hallucination gate. For every
   citation in the final doc: resolves URL, greps the literal quote, checks
   DOI via Crossref. Fails closed on any unverified citation. Writes
   `verify.json` with per-citation status.

## Entry flow

### Step 1 — Preflight + cache check

```bash
# Derive a slug from the question and create a per-run session directory.
TOPIC_SLUG=$(echo "$USER_QUESTION" | bash .claude/skills/research/scripts/check-cache.sh --slugify)
SESSION_DIR="docs/research/.cache/sessions/$(date +%Y-%m-%d-%H%M%S)-$TOPIC_SLUG"
mkdir -p "$SESSION_DIR"

# Ask the cache whether an existing doc can be reused for this topic.
bash .claude/skills/research/scripts/check-cache.sh \
  --topic "$TOPIC_SLUG" \
  --question "$USER_QUESTION" \
  > "$SESSION_DIR/cache-check.json"
```

`cache-check.json` reports: existing doc path (if any), age in days,
content-type bucket (fast/medium/slow/permanent), freshness verdict
(fresh / aging / stale / outdated), recommended action
(reuse | delta-update | full-research).

If the recommended action is `reuse`, return the existing doc path to the
user and exit. Do not burn query tokens on a cache hit.

### Step 2 — Scout (Task tool → research-scout)

Pass `cache-check.json` + the user question. Scout returns
`scout-plan.json`:

```jsonc
{
  "topic_slug": "react-server-components-data-fetching",
  "question": "...",
  "decomposition": ["sub-q1", "sub-q2", "..."],
  "domain": "software-engineering",
  "playbook": "library-evaluation", // from references/domain-playbooks.md
  "content_type_bucket": "fast", // fast | medium | slow | permanent
  "estimated_queries": 12,
  "estimated_minutes": 8,
  "cache_strategy": "delta-update", // from cache-check.json
  "blockers": []
}
```

### Step 3 — Report-then-ask (HARD STOP)

Present a ≤6-line summary to the user:

```
Topic: react-server-components-data-fetching
Domain: software-engineering · Playbook: library-evaluation
Plan: 3 sub-questions, ~12 queries, ~8 min
Cache: delta-update of /docs/research/react-server-components-data-fetching.md (47d old)
Proceed? (y / scope-down / cancel)
```

Do NOT proceed to Step 4 without an explicit reply. Mirror the e2e-audit
report-then-ask gate.
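
As a rough illustration of how the summary could be assembled from `scout-plan.json`, here is a minimal Python sketch. The `plan` keys follow the example in Step 2; the `cache` keys `doc_path` and `age_days` are assumed names for fields of `cache-check.json`, and `render_gate_summary` is a hypothetical helper, not part of the package.

```python
def render_gate_summary(plan: dict, cache: dict) -> str:
    """Build the short confirmation summary shown before any query runs."""
    # `doc_path` and `age_days` are assumed field names for cache-check.json.
    if cache.get("doc_path"):
        cache_line = (
            f"Cache: {plan['cache_strategy']} of {cache['doc_path']} "
            f"({cache['age_days']}d old)"
        )
    else:
        cache_line = "Cache: no existing doc"
    lines = [
        f"Topic: {plan['topic_slug']}",
        f"Domain: {plan['domain']} · Playbook: {plan['playbook']}",
        f"Plan: {len(plan['decomposition'])} sub-questions, "
        f"~{plan['estimated_queries']} queries, ~{plan['estimated_minutes']} min",
        cache_line,
        "Proceed? (y / scope-down / cancel)",
    ]
    return "\n".join(lines)
```

The function only formats text. Waiting for the user's explicit reply remains the caller's responsibility.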

### Step 4 — Query (Task tool → research-query)

Dispatch with `scout-plan.json`. Agent runs parallel searches, writes
`claims.jsonl` (one atomic claim per line) and `sources.jsonl` (one source
per line, with accessed-at timestamp). Each claim references its source by
ID. Hard rule: every claim has at least one verbatim quote from its source.
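
The link between the two files can be checked mechanically. The sketch below assumes each source record carries an `id` and each claim record carries `source_id` and `quote`; these field names are illustrative guesses, since the actual JSONL layout is defined by the agents and is not shown here.

```python
import json


def check_claims(claims_jsonl: str, sources_jsonl: str) -> list[str]:
    """Return a list of problems; an empty list means the files are consistent."""
    # Index sources by their (assumed) "id" field.
    sources = {}
    for line in sources_jsonl.splitlines():
        if line.strip():
            record = json.loads(line)
            sources[record["id"]] = record

    problems = []
    for n, line in enumerate(claims_jsonl.splitlines(), start=1):
        if not line.strip():
            continue
        claim = json.loads(line)
        # Every claim must point at a known source ...
        if claim.get("source_id") not in sources:
            problems.append(f"claim {n}: unknown source_id {claim.get('source_id')!r}")
        # ... and carry at least one verbatim quote.
        if not claim.get("quote", "").strip():
            problems.append(f"claim {n}: missing verbatim quote")
    return problems
```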

### Step 5 — Synthesize (Task tool → research-synthesize)

Dispatch with `claims.jsonl` + `sources.jsonl`. Agent:

1. Builds ontology from `references/ontology-patterns.md` vocabulary.
2. Triangulates: groups claims by assertion, requires ≥3 INDEPENDENT
   sources (Denzin types, not just count) for high-confidence claims.
3. Renders `/docs/research/<topic-slug>.md` from
   `templates/research.md.tpl`.
4. Writes ADR to `/docs/research/decisions/NNNN-<slug>.md` if the
   user's question implies a decision.
5. Calls `scripts/update-index.sh` to regenerate `/docs/research/index.md`
   and any MOCs.

### Step 6 — Verify (Task tool → research-verify)

Dispatch with the rendered doc. Agent runs
`scripts/verify-citations.sh <doc>` which:

- For each `Source` row → fetches URL, checks HTTP 200, greps the
  associated quote.
- For DOIs → hits Crossref API.
- Writes `verify.json` to the session dir.
- Returns non-zero on any failed citation.

If verify fails, the synthesize agent is re-dispatched with the failure
report to fix or remove unverifiable claims. Three failed verify rounds →
abort and surface findings to the user.
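
The retry policy can be pictured as a bounded loop. The following is a sketch only: `verify` and `resynthesize` are hypothetical callables standing in for the research-verify and research-synthesize dispatches, and the returned dictionary shape is an assumption made for illustration.

```python
MAX_VERIFY_ROUNDS = 3  # "three failed verify rounds -> abort"


def verify_with_retries(doc, verify, resynthesize):
    """Run verify up to three times, re-synthesizing after each failure."""
    failures = []
    for round_no in range(1, MAX_VERIFY_ROUNDS + 1):
        failures = verify(doc)  # list of failed citations; empty means pass
        if not failures:
            return {"status": "verified", "rounds": round_no, "doc": doc}
        if round_no < MAX_VERIFY_ROUNDS:
            # Hand the failure report back so claims can be fixed or removed.
            doc = resynthesize(doc, failures)
    # Third failure: stop and surface what is still unverified.
    return {"status": "aborted", "rounds": MAX_VERIFY_ROUNDS, "failures": failures}
```

Capping the loop keeps a persistently unverifiable citation from consuming queries indefinitely, and the remaining failures are reported rather than silently dropped.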

### Step 7 — Persist + summarize

```bash
bash .claude/skills/research/scripts/update-index.sh
echo "$TOPIC_SLUG $(date -u +%Y-%m-%dT%H:%M:%SZ) $VERIFY_STATUS" \
  >> docs/research/.research-state.jsonl
```

Return ≤5 sentences to the user: doc path, claim count, sources cited,
confidence levels, open questions count. Do NOT paste the doc body.

## User flags

- `--force-fresh` — ignore cache, full research even if fresh exists
- `--delta-only` — only update sections that changed since the cached version
- `--scope <bucket>` — narrow content-type bucket (fast | medium | slow | permanent)
- `--playbook <name>` — override playbook detection (from `references/domain-playbooks.md`)
- `--no-verify` — skip verify gate (NOT recommended; only for offline runs)
- `--lang <code>` — output language (default: `en`; accepts `pt`, `es`, etc.)
- `--max-queries <N>` — cap total queries (default 20)
- `--dry-run` — produce scout-plan.json then stop

## Output layout

```
docs/research/
├── index.md                          # auto-regenerated by update-index.sh
├── .research-state.jsonl             # append-only audit log
├── <topic-slug>.md                   # one per topic — main deliverable
├── decisions/
│   ├── index.md
│   └── NNNN-<slug>.md                # ADR (Nygard 2011 format)
├── moc/                              # Maps of Content (Nick Milo)
│   └── <theme>.md
└── .cache/
    └── sessions/<id>/
        ├── cache-check.json
        ├── scout-plan.json
        ├── claims.jsonl
        ├── sources.jsonl
        ├── verify.json
        └── snapshots/<n>.html        # WebFetched page caches for grep
```

## Evidence protocol — URL+QUOTE+ACCESSED-AT+VERIFY-METHOD

Adapted from super-design's SHOT+QUOTE+SEL+VAL. Every non-meta claim in the
output ships:

| Field             | Meaning                                                          |
| ----------------- | ---------------------------------------------------------------- |
| **URL**           | Resolvable HTTP 200 URL OR DOI resolved through Crossref         |
| **QUOTE**         | Verbatim string greppable in the fetched source page             |
| **ACCESSED-AT**   | UTC ISO-8601 timestamp of the fetch                              |
| **VERIFY-METHOD** | `web-fetch` / `crossref-api` / `screenshot` / `archive-snapshot` |
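
To make the four fields concrete, here is an offline sketch of a per-record check in Python. It is not the logic of `verify-citations.sh`: the dictionary keys (`url`, `quote`, `accessed_at`, `verify_method`) and the `doi:` prefix convention are assumptions, and the HTTP 200 and Crossref lookups are deliberately left out so the example runs without network access.

```python
from datetime import datetime, timedelta

VERIFY_METHODS = {"web-fetch", "crossref-api", "screenshot", "archive-snapshot"}


def evidence_problems(ev: dict, page_text: str) -> list[str]:
    """Return the names of evidence fields that fail the contract."""
    problems = []

    # URL: only the shape is checked here; resolving it needs a network call.
    if not ev.get("url", "").startswith(("http://", "https://", "doi:")):
        problems.append("url")

    # QUOTE: must appear verbatim in the fetched page text.
    quote = ev.get("quote", "")
    if not quote or quote not in page_text:
        problems.append("quote")

    # ACCESSED-AT: must parse as ISO-8601 and carry a UTC offset.
    try:
        ts = datetime.fromisoformat(ev.get("accessed_at", "").replace("Z", "+00:00"))
        if ts.utcoffset() != timedelta(0):
            problems.append("accessed_at")
    except ValueError:
        problems.append("accessed_at")

    # VERIFY-METHOD: one of the four values listed in the table.
    if ev.get("verify_method") not in VERIFY_METHODS:
        problems.append("verify_method")

    return problems
```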

`scripts/verify-citations.sh` enforces this contract. Coverage-gap and
"open question" findings are exempt (no claim to verify).

## Freshness — content-type buckets (NOT one-size)

| Bucket        | Examples                                                  | Fresh | Aging   | Stale    | Outdated |
| ------------- | --------------------------------------------------------- | ----- | ------- | -------- | -------- |
| **fast**      | frontend frameworks, AI/LLM SOTA, cloud pricing           | <30d  | 30–90d  | 90–180d  | >180d    |
| **medium**    | established libraries, design patterns, UX best practices | <90d  | 90–180d | 180–365d | >365d    |
| **slow**      | language fundamentals, CS theory, HCI research            | <365d | 1–2y    | 2–5y     | >5y      |
| **permanent** | math theorems, physical laws, historical facts            | <5y   | 5–10y   | 10–20y   | >20y     |

Bucket detection lives in `scripts/check-cache.sh`. Override with `--scope`.
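
The table translates directly into a threshold lookup. The sketch below is an illustration, not the code in `check-cache.sh`. It assumes a year is 365 days, and it assigns a shared boundary value (for example exactly 90 days in the fast bucket) to the later band, except that the outdated band starts strictly above its threshold as the table's `>` suggests.

```python
# (fresh below, aging below, stale up to and including), all in days.
THRESHOLDS = {
    "fast": (30, 90, 180),
    "medium": (90, 180, 365),
    "slow": (365, 2 * 365, 5 * 365),
    "permanent": (5 * 365, 10 * 365, 20 * 365),
}


def freshness_verdict(bucket: str, age_days: int) -> str:
    """Map a document's age to fresh / aging / stale / outdated for its bucket."""
    fresh_below, aging_below, stale_max = THRESHOLDS[bucket]
    if age_days < fresh_below:
        return "fresh"
    if age_days < aging_below:
        return "aging"
    if age_days <= stale_max:
        return "stale"
    return "outdated"
```

For instance, `freshness_verdict("fast", 47)` yields `"aging"`, which is consistent with the 47-day-old document in the Step 3 example being offered a delta-update rather than a reuse.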

## Triangulation — Denzin's 4 types, not "3 sources"

Per `references/research-methodology.md` §4. A claim achieves
**high-confidence** only when it survives ≥3 INDEPENDENT sources where
"independent" means satisfying ≥1 of:

- **Data triangulation** — different time/place/persons
- **Investigator triangulation** — different authors with no shared
  funding/employer
- **Theoretical triangulation** — different theoretical framings reach
  the same conclusion
- **Methodological triangulation** — different methods (survey vs
  interview vs telemetry) converge

Republication chains and citation cascades count as **one** source.
`scripts/verify-citations.sh` flags suspected republication via shared DOM
fingerprints + ownership trees.
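
The counting rule for republication can be sketched in a few lines of Python. This covers only the collapse of republication chains into one source; judging which Denzin type makes two sources independent is not modelled. The `origin_id` field (the ID of the original publication a source republishes) is a hypothetical name introduced for this example.

```python
def independent_source_count(sources: list[dict]) -> int:
    """Count sources after collapsing each republication chain to its origin."""
    # A source with no `origin_id` (assumed field) is treated as its own origin.
    return len({s.get("origin_id") or s["id"] for s in sources})


def is_high_confidence(sources: list[dict], required: int = 3) -> bool:
    """True when at least `required` independent origins back the claim."""
    return independent_source_count(sources) >= required
```

Under this rule, three articles that all republish one wire story count as a single source, so a claim backed only by them would not reach high confidence.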

## Scripts (`.claude/skills/research/scripts/`)

| Script                | Purpose                                                                                              |
| --------------------- | ---------------------------------------------------------------------------------------------------- |
| `check-cache.sh`      | Slugify topic, scan `/docs/research/`, classify content-type bucket, return reuse/delta/full verdict |
| `verify-citations.sh` | Per citation: HTTP 200, quote grep, DOI Crossref check, write verify.json                            |
| `dedup-research.sh`   | Detect overlap between docs (jaccard on concept lists + citation overlap), suggest merge             |
| `update-index.sh`     | Regenerate `/docs/research/index.md` + per-folder indexes from frontmatter                           |
| `extract-claims.py`   | Pull atomic claims with citations from a rendered doc into JSONL                                     |
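
For readers unfamiliar with the Jaccard measure mentioned for `dedup-research.sh`, the following Python sketch shows the standard definition applied to two concept lists. It illustrates the metric only. How the script weighs it against citation overlap, and what threshold triggers a merge suggestion, is not stated here and is not assumed.

```python
def jaccard(a: list[str], b: list[str]) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0  # convention chosen here for two empty lists
    return len(set_a & set_b) / len(set_a | set_b)
```

Two docs with concept lists `["rsc", "suspense", "cache"]` and `["rsc", "cache", "streaming"]` share two of four distinct concepts, giving a similarity of 0.5.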

## Templates (`.claude/skills/research/templates/`)

| Template                     | Output                                      |
| ---------------------------- | ------------------------------------------- |
| `research.md.tpl`            | Main `/docs/research/<slug>.md` deliverable |
| `adr.md.tpl`                 | Nygard ADR for decision questions           |
| `moc.md.tpl`                 | Map of Content for cross-topic themes       |
| `index.md.tpl`               | TOC for `/docs/research/index.md`           |
| `research-state.schema.json` | Schema for state JSONL entries              |

## References (read on demand)

- `references/research-methodology.md` — the 15-topic deep methodology bible
- `references/ontology-patterns.md` — SKOS-adapted relationship vocabulary + LLM-friendly ontology examples
- `references/source-directory.md` — per-domain authoritative sources, authority hierarchies, AI-content red flags
- `references/domain-playbooks.md` — step-by-step protocols per research domain (UX, library eval, API, ADR, market, academic, news, security, pricing)

## Hard rules

1. **Cache first**. Never burn query tokens on a fresh cache hit.
2. **Report-then-ask**. After scout, STOP for user confirmation before query.
3. **Every claim has URL+QUOTE+ACCESSED-AT+VERIFY-METHOD**. Verify gate fails closed on violations.
4. **No fabricated citations, ever**. If a quote cannot be grepped in the fetched page, the claim is dropped.
5. **Triangulate by Denzin type, not raw count**. 3 republications of one wire story = 1 source.
6. **Content-type freshness**. Don't apply fast-bucket aging to slow-bucket topics or vice versa.
7. **Output to `/docs/research/`** — never to `.claude/skills/research-cache/` (legacy).
8. **English output by default** — even when triggered in Portuguese. Override with `--lang pt`.
9. **Summary to user ≤5 sentences**. Doc body lives in the file.
10. **Skill ⊥ super-design ⊥ e2e-audit**. If the user asked for a UX audit or test audit, hand off — do not improvise.

## Boundaries (what this skill does NOT do)

- Does NOT implement code based on its own findings. Hand the doc to the user / implementing agent.
- Does NOT replace `super-design` for design audits or `e2e-audit` for test audits.
- Does NOT publish research outside `/docs/research/`.
- Does NOT invent fixtures or scrape behind paywalls.
- Does NOT bypass robots.txt or other access restrictions on cited sites.

## Invocation triggers (enforced by SessionStart hook)

EN: `research`, `investigate`, `find info`, `search for`, `look up`,
`evaluate`, `compare`, `audit literature`, `competitor analysis`,
`market research`, `library evaluation`, `prior art`.

PT: `pesquisar`, `pesquisa`, `pesquise`, `investigar`, `buscar info`,
`procurar info`, `comparar`, `avaliar biblioteca`, `análise de mercado`,
`análise de concorrentes`.

The hook injects this context at session start. Claude must read this
SKILL.md before improvising a research plan.