start-vibing 4.4.14 → 4.4.16

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,46 +1,33 @@
  ---
  name: research
- version: 0.1.0
+ version: 0.2.0
  description: >
- Performs Baymard-Institute-grade research on any topic the user asks about
- (UX patterns, library evaluation, market analysis, academic literature, API
- integration, architectural decisions). MUST BE USED when the user mentions
- research, investigate, find info, search for, look up, pesquisar, pesquisa,
- pesquise, investigar, or asks to evaluate / compare / understand any
- technology, framework, vendor, methodology, or domain. Source-first
- pipeline: scout → query → synthesize → verify. Output goes to
- /docs/research/<topic>.md with URL+QUOTE+ACCESSED-AT+VERIFY-METHOD evidence
- per claim. Re-uses cached research when fresh, calibrated by content-type.
+ Performs source-first research on any topic the user asks about (UX patterns,
+ library evaluation, market analysis, academic literature, API integration,
+ architectural decisions). MUST BE USED when the user mentions research,
+ investigate, find info, search for, look up, pesquisar, pesquisa, pesquise,
+ investigar, or asks to evaluate / compare / understand any technology,
+ framework, vendor, methodology, or domain. Four-agent pipeline: scout →
+ query (parallel fan-out) → synthesize → verify. Output is a developer-readable
+ engineering-blog briefing at /docs/research/<topic>.md: TL;DR-first, bolded-bullet
+ findings, embedded hyperlink citations. Every claim ships URL+QUOTE+ACCESSED-AT+VERIFY-METHOD
+ evidence. Re-uses cached research when fresh, calibrated by content-type.
  ---

- # research — evidence-backed knowledge production
+ # research — evidence-backed knowledge production for developers

- > **Operating principle**: every claim in research output must be defensible
- > to a skeptical stakeholder. URL resolves, quote is in source, source is
- > independent. Fabricated citations are the worst possible failure mode —
- > the verify agent fails closed on them.
+ > **Operating principle**: every claim in research output must be defensible to a skeptical engineer. URL resolves, quote is in source, source is independent. Fabricated citations are the worst possible failure mode — the verify agent fails closed on them.
+ >
+ > **Output principle**: the deliverable is an engineering-blog briefing, not an academic paper. Lead with the verdict. Use bolded bullets, not numbered subsections with paragraphs. Embed hyperlinks in prose. No ontology maps, no SKOS, no 30-item concept frontmatter. Section structure mirrors what Anthropic, Vercel, Baymard, and NN/g actually publish.

  ## What this skill does

  Four-phase pipeline with 4 specialist agents:

- 1. **Scout** (research-scout, Haiku) — decomposes the user's question, checks
- `/docs/research/` for existing fresh findings (content-type-calibrated
- freshness — fast/medium/slow/permanent buckets), proposes a scoped research
- plan + estimated query budget. Hands directly to Query — NO confirmation
- gate (auto-proceed). User can interrupt mid-run if needed.
- 2. **Query** (research-query, Sonnet) — executes web/library searches in
- parallel, fetches pages via WebFetch + context7 (for library docs),
- extracts atomic claims with URL+QUOTE+ACCESSED-AT evidence, dumps to
- `claims.jsonl`.
- 3. **Synthesize** (research-synthesize, Sonnet) — builds a lightweight
- SKOS-adapted ontology, triangulates each claim across ≥3 independent
- sources (Denzin's 4 types, not just count), produces final
- `/docs/research/<topic>.md` and updates `index.md`.
- 4. **Verify** (research-verify, Haiku) — anti-hallucination gate. For every
- citation in the final doc: resolves URL, greps the literal quote, checks
- DOI via Crossref. Fails closed on any unverified citation. Writes
- `verify.json` with per-citation status.
+ 1. **Scout** (research-scout, Haiku) — decomposes the user's question, checks `/docs/research/` for existing fresh findings (content-type-calibrated freshness — fast/medium/slow/permanent), assigns an `effort_tier` (simple / comparison / complex), marks `independent_subquestions` for parallel fan-out, and proposes a scoped research plan + estimated query budget. Hands directly to Query — NO confirmation gate (auto-proceed).
+ 2. **Query** (research-query, Sonnet) — when `effort_tier ≠ simple`, fans the independent sub-questions out to 2–10+ parallel subagents (Anthropic's measured 90% latency reduction comes from this single change). Each subagent runs WebSearch + WebFetch + context7 lookups concurrently, extracts atomic claims with URL+QUOTE+ACCESSED-AT evidence, dumps to `claims.jsonl`. Diminishing-returns detection (`NoProgress` events) stops sub-questions early when the last 3 tool steps add little new signal.
+ 3. **Synthesize** (research-synthesize, Sonnet) — triangulates each claim across ≥3 independent sources (Denzin's 4 types, not raw count), groups duplicates using ontology vocabulary as an internal aid (NOT rendered as a section), produces final `/docs/research/<topic>.md` from `templates/research.md.tpl` in engineering-blog format. Updates `index.md` and any MOCs.
+ 4. **Verify** (research-verify, Haiku) — anti-hallucination gate. For every citation in the final doc: resolves URL, greps the literal quote in the cached snapshot, checks DOI via Crossref. Fails closed on any unverified citation. Writes `verify.json` with per-citation status.

  ## Entry flow

@@ -48,8 +35,8 @@ Four-phase pipeline with 4 specialist agents:

  ```bash
  TOPIC_SLUG=$(echo "$USER_QUESTION" | bash .claude/skills/research/scripts/check-cache.sh --slugify)
- SESSION_DIR="docs/research/.cache/sessions/$(date +%Y-%m-%d-%H%M%S)-$TOPIC_SLUG"
- mkdir -p "$SESSION_DIR"
+ SESSION_DIR="docs/research/.cache/sessions/$(date +%Y%m%d%H%M%S)-$TOPIC_SLUG"
+ mkdir -p "$SESSION_DIR/snapshots"

  bash .claude/skills/research/scripts/check-cache.sh \
  --topic "$TOPIC_SLUG" \
@@ -57,104 +44,112 @@ bash .claude/skills/research/scripts/check-cache.sh \
  > "$SESSION_DIR/cache-check.json"
  ```

- `cache-check.json` reports: existing doc path (if any), age in days,
- content-type bucket (fast/medium/slow/permanent), freshness verdict
- (fresh / aging / stale / outdated), recommended action
- (reuse | delta-update | full-research).
+ `cache-check.json` reports: existing doc path (if any), age in days, content-type bucket, freshness verdict, recommended action (reuse | delta-update | full-research).

- If verdict is `reuse`, return the existing doc path to the user and exit. Do
- not burn query tokens on a cache hit.
+ If verdict is `reuse`, return the existing doc path to the user and exit. Do not burn query tokens on a cache hit.
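The reuse short-circuit can be sketched as a minimal snippet, assuming only the `cache-check.json` fields named above (`doc_path` is an assumed key name for the existing doc path):

```shell
# Sketch of the cache-hit short-circuit. The sample payload stands in for real
# check-cache.sh output; "doc_path" is an assumed key name for the existing doc.
CACHE_CHECK=$(mktemp)
cat > "$CACHE_CHECK" <<'EOF'
{"doc_path":"docs/research/react-server-components.md","age_days":3,"content_type_bucket":"fast","freshness":"fresh","action":"reuse"}
EOF

ACTION=$(jq -r '.action' "$CACHE_CHECK")
if [ "$ACTION" = "reuse" ]; then
  # Cache hit: return the existing doc path and skip the query phase entirely.
  DOC=$(jq -r '.doc_path' "$CACHE_CHECK")
  echo "Fresh research already exists: $DOC"
fi
```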
  ### Step 2 — Scout (Task tool → research-scout)

- Pass `cache-check.json` + the user question. Scout returns
- `scout-plan.json`:
+ Pass `cache-check.json` + the user question. Scout returns `scout-plan.json`:

  ```jsonc
  {
  "topic_slug": "react-server-components-data-fetching",
  "question": "...",
  "decomposition": ["sub-q1", "sub-q2", "..."],
+ "independent_subquestions": [0, 1, 2], // drives parallel fan-out
  "domain": "software-engineering",
- "playbook": "library-evaluation", // from references/domain-playbooks.md
- "content_type_bucket": "fast", // fast | medium | slow | permanent
- "estimated_queries": 12,
+ "playbook": "library-evaluation",
+ "content_type_bucket": "fast",
+ "effort_tier": "comparison", // simple | comparison | complex
+ "estimated_queries": 14,
  "estimated_minutes": 8,
- "cache_strategy": "delta-update", // from cache-check.json
+ "cache_strategy": "delta-update",
+ "decision_required": true,
  "blockers": [],
  }
  ```

  ### Step 3 — Auto-proceed (no confirmation gate)

- Print a ≤4-line summary for visibility, then immediately dispatch Query.
- Do NOT wait for user confirmation. The user can `/cancel` or interrupt
- mid-run if scope is wrong.
+ Print a ≤4-line summary for visibility, then immediately dispatch Query. Do NOT wait for user confirmation. The user can interrupt mid-run if scope is wrong.

  ```
- Topic: <slug> · Plan: <N> sub-questions, ~<Q> queries, ~<M> min
- Cache: <strategy> · Proceeding to query...
+ Topic: <slug> · Tier: <effort_tier> · Plan: <N> sub-questions, ~<Q> queries, ~<M> min
+ Cache: <strategy> · Fan-out: <K> parallel subagents · Proceeding to query...
  ```

  Exception: if `--dry-run` flag is set, stop here and return scout-plan.json.

  ### Step 4 — Query (Task tool → research-query)

- Dispatch with `scout-plan.json`. Agent runs parallel searches, writes
- `claims.jsonl` (one atomic claim per line) and `sources.jsonl` (one source
- per line, with accessed-at timestamp). Each claim references its source by
- ID. Hard rule: every claim has at least one verbatim quote from its source.
+ Dispatch with `scout-plan.json`. Agent picks execution shape from `effort_tier`:
+
+ | `effort_tier` | Pattern | Concurrency | Tool calls per sub-q |
+ | ------------- | ------------------------------- | ----------- | -------------------- |
+ | `simple` | Single executor, no fan-out | 1 agent | 3–10 |
+ | `comparison` | Lead + 2–4 parallel subagents | 2–4 | 10–15 each |
+ | `complex` | Lead + 5–10+ parallel subagents | 5–10+ | varies |
+
+ Anthropic's verbatim heuristic: _"Simple fact-finding requires just 1 agent with 3-10 tool calls, direct comparisons might need 2-4 subagents with 10-15 calls each, and complex research might use more than 10 subagents"_ ([Anthropic Engineering](https://www.anthropic.com/engineering/multi-agent-research-system)).
+
+ Why this matters for quality, not just speed: from the same post — _"token usage by itself explains 80% of the variance, with the number of tool calls and the model choice as the two other explanatory factors"_. Parallelization is a quality lever.
+
+ Agent writes `claims.jsonl` (one atomic claim per line) and `sources.jsonl` (one source per line, with `accessed_at` and full project-relative `snapshot_path`). Each claim has at least one verbatim quote from its source, pre-validated by grepping against the snapshot before append.
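The pre-append validation can be sketched like this (a hypothetical fragment; field names follow the descriptions in this doc, and the data is illustrative):

```shell
# Hypothetical sketch of the quote pre-validation: a claim line is appended to
# claims.jsonl only if its verbatim quote greps in the cached snapshot.
SESSION_DIR=$(mktemp -d)
mkdir -p "$SESSION_DIR/snapshots"
echo 'token usage by itself explains 80% of the variance' > "$SESSION_DIR/snapshots/1.md"

CLAIM='{"claim":"Token budget drives research quality","quote":"explains 80% of the variance","snapshot_path":"'"$SESSION_DIR"'/snapshots/1.md"}'
QUOTE=$(jq -r '.quote' <<<"$CLAIM")
SNAP=$(jq -r '.snapshot_path' <<<"$CLAIM")

# Append only when the verbatim quote is actually present in the snapshot.
if grep -qF "$QUOTE" "$SNAP"; then
  echo "$CLAIM" >> "$SESSION_DIR/claims.jsonl"
fi
```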
  ### Step 5 — Synthesize (Task tool → research-synthesize)

  Dispatch with `claims.jsonl` + `sources.jsonl`. Agent:

- 1. Builds ontology from `references/ontology-patterns.md` vocabulary.
- 2. Triangulates: groups claims by assertion, requires ≥3 INDEPENDENT
- sources (Denzin types, not just count) for high-confidence claims.
- 3. Renders `/docs/research/<topic-slug>.md` from
- `templates/research.md.tpl`.
- 4. Writes ADR to `/docs/research/decisions/NNNN-<slug>.md` if the
- user's question implies a decision.
- 5. Calls `scripts/update-index.sh` to regenerate `/docs/research/index.md`
- and any MOCs.
+ 1. Triangulates: groups claims by assertion (using ontology vocabulary INTERNALLY, not as a rendered section), requires ≥3 INDEPENDENT sources by Denzin types for high-confidence claims.
+ 2. Renders `/docs/research/<topic-slug>.md` from `templates/research.md.tpl` in engineering-blog format:
+ - Frontmatter (minimal — `concepts:` capped at 8)
+ - **TL;DR** with verdict-first lead + 5–7 bolded-verdict bullets
+ - **Why this matters** (2–4 prose paragraphs)
+ - **What we found** (flat bolded bullets, ONE LINE EACH — not numbered subsections + paragraphs + blockquotes)
+ - **Where the evidence disagrees** (only if disagreements exist)
+ - **Trade-offs** (replaces DO/AVOID; a single honest section)
+ - **Open questions** (only if any)
+ - **Sources** table at bottom
+ 3. Citation style: embedded hyperlinks in prose by default. Numeric `[1]` only on explicit user request or 4+ source overflow.
+ 4. Writes ADR to `/docs/research/decisions/NNNN-<slug>.md` if `decision_required`.
+ 5. Calls `scripts/update-index.sh` to regenerate `/docs/research/index.md` and any MOCs.
+
+ DROPPED from the v0.1 template: `## Ontology Map`, the default `## Disagreements` (now conditional), `## Implementation Path`, `## Dead Ends`, and the 30-item concept frontmatter list.

  ### Step 6 — Verify (Task tool → research-verify)

- Dispatch with the rendered doc. Agent runs
- `scripts/verify-citations.sh <doc>` which:
+ Dispatch with the rendered doc. Agent runs `scripts/verify-citations.sh <doc> <session_dir>` which:

- - For each `Source` row → fetches URL, checks HTTP 200, greps the
- associated quote.
+ - For each `Source` row → fetches URL, checks HTTP 200, greps the associated quote against `snapshot_path`.
  - For DOIs → hits Crossref API.
  - Writes `verify.json` to the session dir.
  - Returns non-zero on any failed citation.

- If verify fails, the synthesize agent is re-dispatched with the failure
- report to fix or remove unverifiable claims. Three failed verify rounds →
- abort and surface findings to the user.
+ If verify fails, the synthesize agent is re-dispatched with the failure report to fix or remove unverifiable claims. Three failed verify rounds → abort and surface findings to the user.
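The fail-closed quote-grep half of that check can be sketched as follows (a minimal sketch; the real script also performs the HTTP 200 and Crossref checks, and the field names follow the `sources.jsonl` description in this doc):

```shell
# Minimal sketch of the fail-closed quote check: one genuine citation and one
# fabricated one. Any miss flips the status, mirroring "fails closed".
WORK=$(mktemp -d)
printf 'Parallelization is a quality lever.\n' > "$WORK/1.md"
cat > "$WORK/sources.jsonl" <<EOF
{"url":"https://example.com/post","quote":"quality lever","snapshot_path":"$WORK/1.md"}
{"url":"https://example.com/fake","quote":"this quote was never in the source","snapshot_path":"$WORK/1.md"}
EOF

STATUS=0
while IFS= read -r src; do
  QUOTE=$(jq -r '.quote' <<<"$src")
  SNAP=$(jq -r '.snapshot_path' <<<"$src")
  if ! grep -qF "$QUOTE" "$SNAP"; then
    echo "FAIL $(jq -r '.url' <<<"$src")"  # unverified citation
    STATUS=1
  fi
done < "$WORK/sources.jsonl"
echo "verify status: $STATUS"
```

The second record fails the grep, so the run exits non-zero and would trigger a synthesize re-dispatch.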
  ### Step 7 — Persist + summarize

  ```bash
  bash .claude/skills/research/scripts/update-index.sh
- echo "$TOPIC_SLUG $(date -u +%Y-%m-%dT%H:%M:%SZ) $VERIFY_STATUS" \
+ echo "{\"topic\":\"$TOPIC_SLUG\",\"timestamp\":\"$(date -u +%Y-%m-%dT%H:%M:%SZ)\",\"verify_status\":\"$VERDICT\",\"pass\":$N_PASS,\"stale\":$N_STALE,\"fail\":$N_FAIL}" \
  >> docs/research/.research-state.jsonl
  ```

- Return ≤5 sentences to the user: doc path, claim count, sources cited,
- confidence levels, open questions count. Do NOT paste the doc body.
+ Return ≤5 sentences to the user: doc path, finding count, sources cited, confidence levels, open questions count. Do NOT paste the doc body.

  ## User flags

  - `--force-fresh` — ignore cache, full research even if fresh exists
  - `--delta-only` — only update sections that changed since the cached version
  - `--scope <bucket>` — narrow content-type bucket (fast | medium | slow | permanent)
- - `--playbook <name>` — override playbook detection (from `references/domain-playbooks.md`)
+ - `--playbook <name>` — override playbook detection
+ - `--effort <tier>` — override effort tier (simple | comparison | complex)
  - `--no-verify` — skip verify gate (NOT recommended; only for offline runs)
  - `--lang <code>` — output language (default: `en`; accepts `pt`, `es`, etc.)
- - `--max-queries <N>` — cap total queries (default 20)
+ - `--max-queries <N>` — cap total queries (overrides scout estimate)
  - `--dry-run` — produce scout-plan.json then stop
+ - `--cite-style <style>` — `inline-hyperlink` (default) | `numeric-footnote`

  ## Output layout

@@ -174,14 +169,14 @@ docs/research/
  ├── scout-plan.json
  ├── claims.jsonl
  ├── sources.jsonl
+ ├── progress.log # NoProgress events from query
  ├── verify.json
- └── snapshots/<n>.html # WebFetched page caches for grep
+ └── snapshots/<n>.md # WebFetched page caches for grep
  ```

  ## Evidence protocol — URL+QUOTE+ACCESSED-AT+VERIFY-METHOD

- Adapted from super-design's SHOT+QUOTE+SEL+VAL. Every non-meta claim in the
- output ships:
+ Every non-meta claim in the output ships:

  | Field | Meaning |
  | ----------------- | ---------------------------------------------------------------- |
@@ -190,8 +185,7 @@ output ships:
  | **ACCESSED-AT** | UTC ISO-8601 timestamp of the fetch |
  | **VERIFY-METHOD** | `web-fetch` / `crossref-api` / `screenshot` / `archive-snapshot` |

- `scripts/verify-citations.sh` enforces this contract. Coverage-gap and
- "open question" findings are exempt (no claim to verify).
+ `scripts/verify-citations.sh` enforces this contract. Coverage-gap and "open question" findings are exempt (no claim to verify).
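For illustration, a single claim record carrying the full contract might look like this (key names are assumptions; only the four fields themselves are specified above), with a fail-closed completeness check:

```shell
# Illustrative evidence record; the contract check flags any missing field,
# mirroring the fail-closed behavior described above. Key names are assumed.
RECORD='{"claim":"Example assertion","url":"https://example.com/post","quote":"example verbatim quote","accessed_at":"2025-01-15T09:30:00Z","verify_method":"web-fetch"}'

MISSING=0
for FIELD in url quote accessed_at verify_method; do
  jq -e --arg f "$FIELD" 'has($f)' <<<"$RECORD" >/dev/null || MISSING=1
done
echo "evidence contract violated: $MISSING"
```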
  ## Freshness — content-type buckets (NOT one-size)

@@ -206,46 +200,62 @@ Bucket detection lives in `scripts/check-cache.sh`. Override with `--scope`.

  ## Triangulation — Denzin's 4 types, not "3 sources"

- Per `references/research-methodology.md` §4. A claim achieves
- **high-confidence** only when it survives ≥3 INDEPENDENT sources where
- "independent" means satisfying ≥1 of:
+ A claim achieves **high-confidence** only when it survives ≥3 INDEPENDENT sources where "independent" means satisfying ≥1 of:

  - **Data triangulation** — different time/place/persons
- - **Investigator triangulation** — different authors with no shared
- funding/employer
- - **Theoretical triangulation** — different theoretical framings reach
- the same conclusion
- - **Methodological triangulation** — different methods (survey vs
- interview vs telemetry) converge
+ - **Investigator triangulation** — different authors with no shared funding/employer
+ - **Theoretical triangulation** — different theoretical framings reach the same conclusion
+ - **Methodological triangulation** — different methods (survey vs interview vs telemetry) converge
+
+ Republication chains and citation cascades count as **one** source. The verify gate flags suspected republication via shared DOM fingerprints + ownership trees.
+
+ **Render rule (synthesize)**: confidence is rendered as a parenthetical next to each finding bullet — `_[high — Anthropic + Vercel + NN/g]_`. NOT as a separate methodology box, NOT as an Ontology Map, NOT as a triangulation matrix.
+
+ ## Diminishing-returns detection (research-query)
+
+ After every batch of fetches, the query agent evaluates whether the last 3 tool steps produced new signal:
+
+ - New non-boilerplate tokens added to `claims.jsonl` in the last 3 steps < 500
+ - Tool work in the last 3 steps < $0.01
+
+ If both are below threshold → emit a `NoProgress` event to `$SESSION_DIR/progress.log`. After **2 consecutive `NoProgress` events** on a sub-question → terminate that sub-question. After **4 consecutive** across the run → terminate research entirely and hand off to synthesize.
+
+ Thresholds are illustrative — calibrate per project.
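A minimal sketch of the stop rule with those thresholds hardcoded (real token and cost instrumentation is assumed; for brevity this version counts total rather than strictly consecutive events):

```shell
# Illustrative NoProgress stop rule. NEW_TOKENS / STEP_COST_CENTS stand in for
# real instrumentation of the last 3 tool steps.
SESSION_DIR=$(mktemp -d)
SUBQ=1
NEW_TOKENS=120       # non-boilerplate tokens added to claims.jsonl, last 3 steps
STEP_COST_CENTS=0    # tool spend in whole cents (< 1 means < $0.01)

if [ "$NEW_TOKENS" -lt 500 ] && [ "$STEP_COST_CENTS" -lt 1 ]; then
  echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) NoProgress sub-q=$SUBQ" >> "$SESSION_DIR/progress.log"
fi

# Simplified: counts logged events for this sub-question instead of tracking
# strictly consecutive ones.
EVENTS=$(grep -c "NoProgress sub-q=$SUBQ" "$SESSION_DIR/progress.log")
if [ "$EVENTS" -ge 2 ]; then
  echo "terminate sub-question $SUBQ"
fi
echo "NoProgress events: $EVENTS"
```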
+
+ ## Length scales with `effort_tier`
+
+ | `effort_tier` | Target lines | Sections expected |
+ | ------------- | ------------ | ------------------------------------------- |
+ | `simple` | 80–150 | TL;DR · What we found · Sources |
+ | `comparison` | 150–280 | + Why this matters · Trade-offs |
+ | `complex` | 280–450 | + Where evidence disagrees · Open questions |

- Republication chains and citation cascades count as **one** source.
- `scripts/verify-citations.sh` flags suspected republication via shared DOM
- fingerprints + ownership trees.
+ Going under target means not enough information; going over means the reader bails. Anchor on these ranges.

  ## Scripts (`.claude/skills/research/scripts/`)

  | Script | Purpose |
  | --------------------- | ---------------------------------------------------------------------------------------------------- |
  | `check-cache.sh` | Slugify topic, scan `/docs/research/`, classify content-type bucket, return reuse/delta/full verdict |
- | `verify-citations.sh` | Per citation: HTTP 200, quote grep, DOI Crossref check, write verify.json |
+ | `verify-citations.sh` | Per citation: HTTP 200, quote grep against `snapshot_path`, DOI Crossref check, write verify.json |
  | `dedup-research.sh` | Detect overlap between docs (Jaccard on concept lists + citation overlap), suggest merge |
  | `update-index.sh` | Regenerate `/docs/research/index.md` + per-folder indexes from frontmatter |
  | `extract-claims.py` | Pull atomic claims with citations from a rendered doc into JSONL |

  ## Templates (`.claude/skills/research/templates/`)

- | Template | Output |
- | ---------------------------- | ------------------------------------------- |
- | `research.md.tpl` | Main `/docs/research/<slug>.md` deliverable |
- | `adr.md.tpl` | Nygard ADR for decision questions |
- | `moc.md.tpl` | Map of Content for cross-topic themes |
- | `index.md.tpl` | TOC for `/docs/research/index.md` |
- | `research-state.schema.json` | Schema for state JSONL entries |
+ | Template | Output |
+ | ---------------------------- | ------------------------------------------------------------------ |
+ | `research.md.tpl` | Main `/docs/research/<slug>.md` deliverable (engineering-blog briefing) |
+ | `adr.md.tpl` | Nygard ADR for decision questions |
+ | `moc.md.tpl` | Map of Content for cross-topic themes |
+ | `index.md.tpl` | TOC for `/docs/research/index.md` |
+ | `research-state.schema.json` | Schema for state JSONL entries |

  ## References (read on demand)

- - `references/research-methodology.md` — the 15-topic deep methodology bible
- - `references/ontology-patterns.md` — SKOS-adapted relationship vocabulary + LLM-friendly ontology examples
+ - `references/research-methodology.md` — methodology (triangulation, freshness, query engineering)
+ - `references/ontology-patterns.md` — INTERNAL grouping vocabulary for synthesize (NOT rendered as a section)
  - `references/source-directory.md` — per-domain authoritative sources, authority hierarchies, AI-content red flags
  - `references/domain-playbooks.md` — step-by-step protocols per research domain (UX, library eval, API, ADR, market, academic, news, security, pricing)

@@ -254,13 +264,20 @@ fingerprints + ownership trees.
  1. **Cache first**. Never burn query tokens on a fresh cache hit.
  2. **Auto-proceed after scout**. Print summary, immediately dispatch query. NO confirmation gate. User can interrupt mid-run.
  3. **Every claim has URL+QUOTE+ACCESSED-AT+VERIFY-METHOD**. Verify gate fails closed on violations.
- 4. **No fabricated citations, ever**. If a quote cannot be greppped in the fetched page, the claim is dropped.
+ 4. **No fabricated citations, ever**. If a quote cannot be grepped in the fetched page, the claim is dropped.
  5. **Triangulate by Denzin type, not raw count**. 3 republications of one wire story = 1 source.
  6. **Content-type freshness**. Don't apply fast-bucket aging to slow-bucket topics or vice versa.
  7. **Output to `/docs/research/`** — never to `.claude/skills/research-cache/` (legacy).
  8. **English output by default** — even when triggered in Portuguese. Override with `--lang pt`.
  9. **Summary to user ≤5 sentences**. Doc body lives in the file.
  10. **Skill ⊥ super-design ⊥ e2e-audit**. If the user asked for a UX audit or test audit, hand off — do not improvise.
+ 11. **No Ontology Map section in the rendered doc.** Ontology vocabulary is internal-only.
+ 12. **No 30+ concept frontmatter list.** Cap at 8.
+ 13. **Findings render as flat bolded bullets**, not numbered subsections with paragraphs + blockquotes.
+ 14. **Citations default to embedded hyperlinks in prose.** `[Anthropic Engineering](URL)` style. Numeric `[1]` is opt-in via `--cite-style numeric-footnote`.
+ 15. **Parallel fan-out is the norm** for `comparison` and `complex` tiers. Serial execution only for `simple`.
+ 16. **Honor diminishing-returns detection.** 2 consecutive NoProgress events per sub-question → stop that sub-question; 4 across the run → stop research.
+ 17. **Length scales with `effort_tier`.** Don't pad, don't truncate.

  ## Boundaries (what this skill does NOT do)

@@ -272,13 +289,8 @@ fingerprints + ownership trees.

  ## Invocation triggers (enforced by SessionStart hook)

- EN: `research`, `investigate`, `find info`, `search for`, `look up`,
- `evaluate`, `compare`, `audit literature`, `competitor analysis`,
- `market research`, `library evaluation`, `prior art`.
+ EN: `research`, `investigate`, `find info`, `search for`, `look up`, `evaluate`, `compare`, `audit literature`, `competitor analysis`, `market research`, `library evaluation`, `prior art`.

- PT: `pesquisar`, `pesquisa`, `pesquise`, `investigar`, `buscar info`,
- `procurar info`, `comparar`, `avaliar biblioteca`, `análise de
- mercado`, `análise de concorrentes`.
+ PT: `pesquisar`, `pesquisa`, `pesquise`, `investigar`, `buscar info`, `procurar info`, `comparar`, `avaliar biblioteca`, `análise de mercado`, `análise de concorrentes`.

- The hook injects this context at session start. Claude must read this
- SKILL.md before improvising a research plan.
+ The hook injects this context at session start. Claude must read this SKILL.md before improvising a research plan.
@@ -7,111 +7,81 @@ content_type_bucket: "{{BUCKET}}" # fast | medium | slow | permanent
  freshness: "{{FRESHNESS}}" # fresh | aging | stale | outdated
  freshness_window_days: {{WINDOW_DAYS}}
  playbook: "{{PLAYBOOK}}"
- domain: "{{DOMAIN}}"
  sources_count: {{SOURCES_COUNT}}
  findings_count: {{FINDINGS_COUNT}}
- disagreements_count: {{DISAGREEMENTS_COUNT}}
- open_questions_count: {{OPEN_Q_COUNT}}
  confidence_summary: "{{CONFIDENCE_SUMMARY}}" # e.g. "5 high · 3 medium · 1 low"
- concepts:
- {{CONCEPTS_YAML_LIST}}
- session_id: "{{SESSION_ID}}"
+ concepts: # CAP at 8 — primary search keywords only
+ {{CONCEPTS_YAML_LIST_MAX_8}}
  ---

- # Research: {{TITLE}}
+ # {{TITLE}}

  > Bucket: **{{BUCKET}}** · Status: **{{FRESHNESS}}** · Confidence: {{CONFIDENCE_SUMMARY}}
  > Session: `docs/research/.cache/sessions/{{SESSION_ID}}/`

- ## Executive summary
-
- {{EXEC_SUMMARY}}
-
- ## Question
-
- {{ORIGINAL_QUESTION}}
-
- ## Ontology Map
+ ---

- Concepts and their relationships using the SKOS-adapted vocabulary
- (see `.claude/skills/research/references/ontology-patterns.md`).
+ ## TL;DR — {{TLDR_HEADLINE}}

- ```
- {{ONTOLOGY_RELATIONS}}
- ```
+ {{TLDR_LEAD_PARAGRAPH_1_TO_3_SENTENCES}}

- ## Findings
+ {{#each TLDR_BULLET}}
+ {{N}}. **{{VERDICT_PHRASE}}.** {{ONE_SENTENCE_RATIONALE}} ({{INLINE_CITATION}})
+ {{/each}}

- {{#each FINDING}}
- ### Finding {{ID}} — {{TITLE}}
+ ---

- {{ASSERTION}}
+ ## Why this matters

- **Confidence:** {{CONFIDENCE}}{{#if FRESHNESS_WARNING}} · _Freshness warning: {{FRESHNESS_WARNING}}_{{/if}}{{#if TRIANGULATION_WARNING}} · _Triangulation: {{TRIANGULATION_WARNING}}_{{/if}}
+ {{CONTEXT_2_TO_4_PARAGRAPHS ground the reader in the problem the research actually addresses; cite the originating constraint or pain point. Keep it engineering-blog tone. NO methodology box, NO triangulation discussion here.}}

- Evidence:
+ ---

- {{#each EVIDENCE}}
- > "{{QUOTE}}" [{{SOURCE_ID}}]
- > URL: {{URL}}
- > Accessed: {{ACCESSED_AT}}
- > Verify: {{VERIFY_METHOD}}
- {{/each}}
+ ## What we found

+ {{#each FINDING}}
+ - **{{ASSERTION_AS_VERDICT}}** — {{ONE_OR_TWO_SENTENCE_EVIDENCE_SUMMARY_WITH_INLINE_HYPERLINK}}. _[{{CONFIDENCE}} — {{TRIANGULATION_TAGS}}]_
  {{/each}}

- ## Disagreements
-
- {{#each DISAGREEMENT}}
- ### {{TOPIC}}
-
- - **Position A** ([{{SRC_A}}]): {{POSITION_A}}
- - **Position B** ([{{SRC_B}}]): {{POSITION_B}}
- - **Resolution requires:** {{RESOLUTION_HINT}}
- {{/each}}
+ > Use bolded-bullet flat format. Do NOT use the heavy "### Finding N" / paragraph / blockquote / confidence-label pattern.
+ > Inline citation style: embedded hyperlink in the prose — `[Anthropic Engineering](URL)` — NOT bracketed numerics. Reserve `[1]` numerics only when the user explicitly requested footnote style.

- ## Recommendations
+ {{#if DISAGREEMENTS_EXIST}}
+ ---

- ### DO
+ ## Where the evidence disagrees

- {{#each RECOMMENDATION_DO}}
- - {{TEXT}} — _{{REASON}}_
+ {{#each DISAGREEMENT}}
+ - **{{TOPIC}}**: [{{SRC_A_LABEL}}]({{SRC_A_URL}}) says "{{POSITION_A}}". [{{SRC_B_LABEL}}]({{SRC_B_URL}}) says "{{POSITION_B}}". Resolution would require {{RESOLUTION_HINT}}.
  {{/each}}
+ {{/if}}

- ### AVOID
+ ---

- {{#each RECOMMENDATION_AVOID}}
- - {{TEXT}} — _{{REASON}}_
- {{/each}}
+ ## Trade-offs

- ## Implementation Path
+ {{TRADE_OFFS_2_TO_5_BULLETS what the recommended approach gives up, NOT a separate "AVOID" list. Frame as "Choosing X means losing Y." Each bullet cites at least one source.}}

- {{#each STEP}}
- {{N}}. {{TEXT}}
- {{/each}}
+ {{#if OPEN_QUESTIONS_EXIST}}
+ ---

- ## Open Questions
+ ## Open questions

  {{#each OPEN_Q}}
  - {{TEXT}}
  {{/each}}
+ {{/if}}

- ## Dead Ends
-
- _Searched but not found / not applicable_
-
- {{#each DEAD_END}}
- - {{TEXT}}
- {{/each}}
+ ---

  ## Sources

- | ID | Title | Publisher | Authority (1-5) | Independence | Accessed | URL |
- |----|-------|-----------|-----------------|--------------|----------|-----|
+ | ID | Title | Publisher | Authority | Independence | Accessed |
+ |----|-------|-----------|-----------|--------------|----------|
  {{#each SOURCE}}
- | {{ID}} | {{TITLE}} | {{PUBLISHER}} | {{AUTHORITY_LEVEL}} | {{INDEPENDENCE}} | {{ACCESSED_AT}} | {{URL}} |
+ | {{ID}} | [{{TITLE}}]({{URL}}) | {{PUBLISHER}} | {{AUTHORITY_LEVEL}}/5 | {{INDEPENDENCE}} | {{ACCESSED_AT}} |
  {{/each}}

  ---

- _Generated by the `research` skill. Pipeline: scout → query → synthesize → verify. Verify status: {{VERIFY_STATUS}}._
+ _Research pipeline: scout → query → synthesize → verify. Verify status: **{{VERIFY_STATUS}}**. {{N_PASS}} pass · {{N_STALE}} stale · {{N_FAIL}} fail._