start-vibing 4.4.14 → 4.4.16

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,46 +1,33 @@
  ---
  name: research
- version: 0.1.0
+ version: 0.2.0
  description: >
- Performs Baymard-Institute-grade research on any topic the user asks about
- (UX patterns, library evaluation, market analysis, academic literature, API
- integration, architectural decisions). MUST BE USED when the user mentions
- research, investigate, find info, search for, look up, pesquisar, pesquisa,
- pesquise, investigar, or asks to evaluate / compare / understand any
- technology, framework, vendor, methodology, or domain. Source-first
- pipeline: scout → query → synthesize → verify. Output goes to
- /docs/research/<topic>.md with URL+QUOTE+ACCESSED-AT+VERIFY-METHOD evidence
- per claim. Re-uses cached research when fresh, calibrated by content-type.
+ Performs source-first research on any topic the user asks about (UX patterns,
+ library evaluation, market analysis, academic literature, API integration,
+ architectural decisions). MUST BE USED when the user mentions research,
+ investigate, find info, search for, look up, pesquisar, pesquisa, pesquise,
+ investigar, or asks to evaluate / compare / understand any technology,
+ framework, vendor, methodology, or domain. Four-agent pipeline: scout →
+ query (parallel fan-out) → synthesize → verify. Output is a developer-readable
+ engineering-blog briefing at /docs/research/<topic>.md: TL;DR-first, bolded-bullet
+ findings, embedded hyperlink citations. Every claim ships URL+QUOTE+ACCESSED-AT+VERIFY-METHOD
+ evidence. Re-uses cached research when fresh, calibrated by content-type.
  ---

- # research — evidence-backed knowledge production
+ # research — evidence-backed knowledge production for developers

- > **Operating principle**: every claim in research output must be defensible
- > to a skeptical stakeholder. URL resolves, quote is in source, source is
- > independent. Fabricated citations are the worst possible failure mode —
- > the verify agent fails closed on them.
+ > **Operating principle**: every claim in research output must be defensible to a skeptical engineer. URL resolves, quote is in source, source is independent. Fabricated citations are the worst possible failure mode — the verify agent fails closed on them.
+ >
+ > **Output principle**: the deliverable is an engineering-blog briefing, not an academic paper. Lead with the verdict. Use bolded bullets, not numbered subsections with paragraphs. Embed hyperlinks in prose. No ontology maps, no SKOS, no 30-item concept frontmatter. Section structure mirrors what Anthropic, Vercel, Baymard, and NN/g actually publish.

  ## What this skill does

  Four-phase pipeline with 4 specialist agents:

- 1. **Scout** (research-scout, Haiku) — decomposes the user's question, checks
- `/docs/research/` for existing fresh findings (content-type-calibrated
- freshness — fast/medium/slow/permanent buckets), proposes a scoped research
- plan + estimated query budget. Hands directly to Query — NO confirmation
- gate (auto-proceed). User can interrupt mid-run if needed.
- 2. **Query** (research-query, Sonnet) — executes web/library searches in
- parallel, fetches pages via WebFetch + context7 (for library docs),
- extracts atomic claims with URL+QUOTE+ACCESSED-AT evidence, dumps to
- `claims.jsonl`.
- 3. **Synthesize** (research-synthesize, Sonnet) — builds a lightweight
- SKOS-adapted ontology, triangulates each claim across ≥3 independent
- sources (Denzin's 4 types, not just count), produces final
- `/docs/research/<topic>.md` and updates `index.md`.
- 4. **Verify** (research-verify, Haiku) — anti-hallucination gate. For every
- citation in the final doc: resolves URL, greps the literal quote, checks
- DOI via Crossref. Fails closed on any unverified citation. Writes
- `verify.json` with per-citation status.
+ 1. **Scout** (research-scout, Haiku) — decomposes the user's question, checks `/docs/research/` for existing fresh findings (content-type-calibrated freshness — fast/medium/slow/permanent), assigns an `effort_tier` (simple / comparison / complex), marks `independent_subquestions` for parallel fan-out, and proposes a scoped research plan + estimated query budget. Hands directly to Query — NO confirmation gate (auto-proceed).
+ 2. **Query** (research-query, Sonnet) — when `effort_tier ≠ simple`, fans the independent sub-questions out to 2–10+ parallel subagents (Anthropic's measured 90% latency reduction comes from this single change). Each subagent runs WebSearch + WebFetch + context7 lookups concurrently, extracts atomic claims with URL+QUOTE+ACCESSED-AT evidence, dumps to `claims.jsonl`. Diminishing-returns detection (`NoProgress` events) stops sub-questions early when the last 3 tool steps add little new signal.
+ 3. **Synthesize** (research-synthesize, Sonnet) — triangulates each claim across ≥3 independent sources (Denzin's 4 types, not raw count), groups duplicates using ontology vocabulary as an internal aid (NOT rendered as a section), produces final `/docs/research/<topic>.md` from `templates/research.md.tpl` in engineering-blog format. Updates `index.md` and any MOCs.
+ 4. **Verify** (research-verify, Haiku) — anti-hallucination gate. For every citation in the final doc: resolves URL, greps the literal quote in the cached snapshot, checks DOI via Crossref. Fails closed on any unverified citation. Writes `verify.json` with per-citation status.

  ## Entry flow

@@ -48,8 +35,8 @@ Four-phase pipeline with 4 specialist agents:

  ```bash
  TOPIC_SLUG=$(echo "$USER_QUESTION" | bash .claude/skills/research/scripts/check-cache.sh --slugify)
- SESSION_DIR="docs/research/.cache/sessions/$(date +%Y-%m-%d-%H%M%S)-$TOPIC_SLUG"
- mkdir -p "$SESSION_DIR"
+ SESSION_DIR="docs/research/.cache/sessions/$(date +%Y%m%d%H%M%S)-$TOPIC_SLUG"
+ mkdir -p "$SESSION_DIR/snapshots"

  bash .claude/skills/research/scripts/check-cache.sh \
  --topic "$TOPIC_SLUG" \
@@ -57,104 +44,112 @@ bash .claude/skills/research/scripts/check-cache.sh \
  > "$SESSION_DIR/cache-check.json"
  ```

- `cache-check.json` reports: existing doc path (if any), age in days,
- content-type bucket (fast/medium/slow/permanent), freshness verdict
- (fresh / aging / stale / outdated), recommended action
- (reuse | delta-update | full-research).
+ `cache-check.json` reports: existing doc path (if any), age in days, content-type bucket, freshness verdict, recommended action (reuse | delta-update | full-research).

- If verdict is `reuse`, return the existing doc path to the user and exit. Do
- not burn query tokens on a cache hit.
+ If verdict is `reuse`, return the existing doc path to the user and exit. Do not burn query tokens on a cache hit.
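The reuse short-circuit can be sketched as a minimal snippet, assuming only the `cache-check.json` fields named above (`doc_path` is an assumed key name for the existing doc path):

```shell
# Sketch of the cache-hit short-circuit. The sample payload stands in for real
# check-cache.sh output; "doc_path" is an assumed key name for the existing doc.
CACHE_CHECK=$(mktemp)
cat > "$CACHE_CHECK" <<'EOF'
{"doc_path":"docs/research/react-server-components.md","age_days":3,"content_type_bucket":"fast","freshness":"fresh","action":"reuse"}
EOF

ACTION=$(jq -r '.action' "$CACHE_CHECK")
if [ "$ACTION" = "reuse" ]; then
  # Cache hit: return the existing doc path and skip the query phase entirely.
  DOC=$(jq -r '.doc_path' "$CACHE_CHECK")
  echo "Fresh research already exists: $DOC"
fi
```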
  ### Step 2 — Scout (Task tool → research-scout)

- Pass `cache-check.json` + the user question. Scout returns
- `scout-plan.json`:
+ Pass `cache-check.json` + the user question. Scout returns `scout-plan.json`:

  ```jsonc
  {
  "topic_slug": "react-server-components-data-fetching",
  "question": "...",
  "decomposition": ["sub-q1", "sub-q2", "..."],
+ "independent_subquestions": [0, 1, 2], // drives parallel fan-out
  "domain": "software-engineering",
- "playbook": "library-evaluation", // from references/domain-playbooks.md
- "content_type_bucket": "fast", // fast | medium | slow | permanent
- "estimated_queries": 12,
+ "playbook": "library-evaluation",
+ "content_type_bucket": "fast",
+ "effort_tier": "comparison", // simple | comparison | complex
+ "estimated_queries": 14,
  "estimated_minutes": 8,
- "cache_strategy": "delta-update", // from cache-check.json
+ "cache_strategy": "delta-update",
+ "decision_required": true,
  "blockers": [],
  }
  ```

  ### Step 3 — Auto-proceed (no confirmation gate)

- Print a ≤4-line summary for visibility, then immediately dispatch Query.
- Do NOT wait for user confirmation. The user can `/cancel` or interrupt
- mid-run if scope is wrong.
+ Print a ≤4-line summary for visibility, then immediately dispatch Query. Do NOT wait for user confirmation. The user can interrupt mid-run if scope is wrong.

  ```
- Topic: <slug> · Plan: <N> sub-questions, ~<Q> queries, ~<M> min
- Cache: <strategy> · Proceeding to query...
+ Topic: <slug> · Tier: <effort_tier> · Plan: <N> sub-questions, ~<Q> queries, ~<M> min
+ Cache: <strategy> · Fan-out: <K> parallel subagents · Proceeding to query...
  ```

  Exception: if `--dry-run` flag is set, stop here and return scout-plan.json.

  ### Step 4 — Query (Task tool → research-query)

- Dispatch with `scout-plan.json`. Agent runs parallel searches, writes
- `claims.jsonl` (one atomic claim per line) and `sources.jsonl` (one source
- per line, with accessed-at timestamp). Each claim references its source by
- ID. Hard rule: every claim has at least one verbatim quote from its source.
+ Dispatch with `scout-plan.json`. Agent picks execution shape from `effort_tier`:
+
+ | `effort_tier` | Pattern | Concurrency | Tool calls per sub-q |
+ | ------------- | ------------------------------- | ----------- | -------------------- |
+ | `simple` | Single executor, no fan-out | 1 agent | 3–10 |
+ | `comparison` | Lead + 2–4 parallel subagents | 2–4 | 10–15 each |
+ | `complex` | Lead + 5–10+ parallel subagents | 5–10+ | varies |
+
+ Anthropic's verbatim heuristic: _"Simple fact-finding requires just 1 agent with 3-10 tool calls, direct comparisons might need 2-4 subagents with 10-15 calls each, and complex research might use more than 10 subagents"_ ([Anthropic Engineering](https://www.anthropic.com/engineering/multi-agent-research-system)).
+
+ Why this matters for quality, not just speed: from the same post — _"token usage by itself explains 80% of the variance, with the number of tool calls and the model choice as the two other explanatory factors"_. Parallelization is a quality lever.
+
+ Agent writes `claims.jsonl` (one atomic claim per line) and `sources.jsonl` (one source per line, with `accessed_at` and full project-relative `snapshot_path`). Each claim has at least one verbatim quote from its source, pre-validated by grepping against the snapshot before append.
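The pre-append validation can be sketched like this (a hypothetical fragment; field names follow the descriptions in this doc, and the data is illustrative):

```shell
# Hypothetical sketch of the quote pre-validation: a claim line is appended to
# claims.jsonl only if its verbatim quote greps in the cached snapshot.
SESSION_DIR=$(mktemp -d)
mkdir -p "$SESSION_DIR/snapshots"
echo 'token usage by itself explains 80% of the variance' > "$SESSION_DIR/snapshots/1.md"

CLAIM='{"claim":"Token budget drives research quality","quote":"explains 80% of the variance","snapshot_path":"'"$SESSION_DIR"'/snapshots/1.md"}'
QUOTE=$(jq -r '.quote' <<<"$CLAIM")
SNAP=$(jq -r '.snapshot_path' <<<"$CLAIM")

# Append only when the verbatim quote is actually present in the snapshot.
if grep -qF "$QUOTE" "$SNAP"; then
  echo "$CLAIM" >> "$SESSION_DIR/claims.jsonl"
fi
```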
  ### Step 5 — Synthesize (Task tool → research-synthesize)

  Dispatch with `claims.jsonl` + `sources.jsonl`. Agent:

- 1. Builds ontology from `references/ontology-patterns.md` vocabulary.
- 2. Triangulates: groups claims by assertion, requires ≥3 INDEPENDENT
- sources (Denzin types, not just count) for high-confidence claims.
- 3. Renders `/docs/research/<topic-slug>.md` from
- `templates/research.md.tpl`.
- 4. Writes ADR to `/docs/research/decisions/NNNN-<slug>.md` if the
- user's question implies a decision.
- 5. Calls `scripts/update-index.sh` to regenerate `/docs/research/index.md`
- and any MOCs.
+ 1. Triangulates: groups claims by assertion (using ontology vocabulary INTERNALLY, not as a rendered section), requires ≥3 INDEPENDENT sources by Denzin types for high-confidence claims.
+ 2. Renders `/docs/research/<topic-slug>.md` from `templates/research.md.tpl` in engineering-blog format:
+ - Frontmatter (minimal — `concepts:` capped at 8)
+ - **TL;DR** with verdict-first lead + 5–7 bolded-verdict bullets
+ - **Why this matters** (2–4 prose paragraphs)
+ - **What we found** (flat bolded bullets, ONE LINE EACH — not numbered subsections + paragraphs + blockquotes)
+ - **Where the evidence disagrees** (only if disagreements exist)
+ - **Trade-offs** (replaces DO/AVOID; a single honest section)
+ - **Open questions** (only if any)
+ - **Sources** table at bottom
+ 3. Citation style: embedded hyperlinks in prose by default. Numeric `[1]` only on explicit user request or 4+ source overflow.
+ 4. Writes ADR to `/docs/research/decisions/NNNN-<slug>.md` if `decision_required`.
+ 5. Calls `scripts/update-index.sh` to regenerate `/docs/research/index.md` and any MOCs.
+
+ DROPPED from the v0.1 template: `## Ontology Map`, the default `## Disagreements` (now conditional), `## Implementation Path`, `## Dead Ends`, and the 30-item concept frontmatter list.

  ### Step 6 — Verify (Task tool → research-verify)

- Dispatch with the rendered doc. Agent runs
- `scripts/verify-citations.sh <doc>` which:
+ Dispatch with the rendered doc. Agent runs `scripts/verify-citations.sh <doc> <session_dir>` which:

- - For each `Source` row → fetches URL, checks HTTP 200, greps the
- associated quote.
+ - For each `Source` row → fetches URL, checks HTTP 200, greps the associated quote against `snapshot_path`.
  - For DOIs → hits Crossref API.
  - Writes `verify.json` to the session dir.
  - Returns non-zero on any failed citation.

- If verify fails, the synthesize agent is re-dispatched with the failure
- report to fix or remove unverifiable claims. Three failed verify rounds →
- abort and surface findings to the user.
+ If verify fails, the synthesize agent is re-dispatched with the failure report to fix or remove unverifiable claims. Three failed verify rounds → abort and surface findings to the user.
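The fail-closed quote-grep half of that check can be sketched as follows (a minimal sketch; the real script also performs the HTTP 200 and Crossref checks, and the field names follow the `sources.jsonl` description in this doc):

```shell
# Minimal sketch of the fail-closed quote check: one genuine citation and one
# fabricated one. Any miss flips the status, mirroring "fails closed".
WORK=$(mktemp -d)
printf 'Parallelization is a quality lever.\n' > "$WORK/1.md"
cat > "$WORK/sources.jsonl" <<EOF
{"url":"https://example.com/post","quote":"quality lever","snapshot_path":"$WORK/1.md"}
{"url":"https://example.com/fake","quote":"this quote was never in the source","snapshot_path":"$WORK/1.md"}
EOF

STATUS=0
while IFS= read -r src; do
  QUOTE=$(jq -r '.quote' <<<"$src")
  SNAP=$(jq -r '.snapshot_path' <<<"$src")
  if ! grep -qF "$QUOTE" "$SNAP"; then
    echo "FAIL $(jq -r '.url' <<<"$src")"  # unverified citation
    STATUS=1
  fi
done < "$WORK/sources.jsonl"
echo "verify status: $STATUS"
```

The second record fails the grep, so the run exits non-zero and would trigger a synthesize re-dispatch.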
  ### Step 7 — Persist + summarize

  ```bash
  bash .claude/skills/research/scripts/update-index.sh
- echo "$TOPIC_SLUG $(date -u +%Y-%m-%dT%H:%M:%SZ) $VERIFY_STATUS" \
+ echo "{\"topic\":\"$TOPIC_SLUG\",\"timestamp\":\"$(date -u +%Y-%m-%dT%H:%M:%SZ)\",\"verify_status\":\"$VERDICT\",\"pass\":$N_PASS,\"stale\":$N_STALE,\"fail\":$N_FAIL}" \
  >> docs/research/.research-state.jsonl
  ```

- Return ≤5 sentences to the user: doc path, claim count, sources cited,
- confidence levels, open questions count. Do NOT paste the doc body.
+ Return ≤5 sentences to the user: doc path, finding count, sources cited, confidence levels, open questions count. Do NOT paste the doc body.

  ## User flags

  - `--force-fresh` — ignore cache, full research even if fresh exists
  - `--delta-only` — only update sections that changed since the cached version
  - `--scope <bucket>` — narrow content-type bucket (fast | medium | slow | permanent)
- - `--playbook <name>` — override playbook detection (from `references/domain-playbooks.md`)
+ - `--playbook <name>` — override playbook detection
+ - `--effort <tier>` — override effort tier (simple | comparison | complex)
  - `--no-verify` — skip verify gate (NOT recommended; only for offline runs)
  - `--lang <code>` — output language (default: `en`; accepts `pt`, `es`, etc.)
- - `--max-queries <N>` — cap total queries (default 20)
+ - `--max-queries <N>` — cap total queries (overrides scout estimate)
  - `--dry-run` — produce scout-plan.json then stop
+ - `--cite-style <style>` — `inline-hyperlink` (default) | `numeric-footnote`

  ## Output layout

@@ -174,14 +169,14 @@ docs/research/
  ├── scout-plan.json
  ├── claims.jsonl
  ├── sources.jsonl
+ ├── progress.log # NoProgress events from query
  ├── verify.json
- └── snapshots/<n>.html # WebFetched page caches for grep
+ └── snapshots/<n>.md # WebFetched page caches for grep
  ```

  ## Evidence protocol — URL+QUOTE+ACCESSED-AT+VERIFY-METHOD

- Adapted from super-design's SHOT+QUOTE+SEL+VAL. Every non-meta claim in the
- output ships:
+ Every non-meta claim in the output ships:

  | Field | Meaning |
  | ----------------- | ---------------------------------------------------------------- |
@@ -190,8 +185,7 @@ output ships:
  | **ACCESSED-AT** | UTC ISO-8601 timestamp of the fetch |
  | **VERIFY-METHOD** | `web-fetch` / `crossref-api` / `screenshot` / `archive-snapshot` |

- `scripts/verify-citations.sh` enforces this contract. Coverage-gap and
- "open question" findings are exempt (no claim to verify).
+ `scripts/verify-citations.sh` enforces this contract. Coverage-gap and "open question" findings are exempt (no claim to verify).
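For illustration, a single claim record carrying the full contract might look like this (key names are assumptions; only the four fields themselves are specified above), with a fail-closed completeness check:

```shell
# Illustrative evidence record; the contract check flags any missing field,
# mirroring the fail-closed behavior described above. Key names are assumed.
RECORD='{"claim":"Example assertion","url":"https://example.com/post","quote":"example verbatim quote","accessed_at":"2025-01-15T09:30:00Z","verify_method":"web-fetch"}'

MISSING=0
for FIELD in url quote accessed_at verify_method; do
  jq -e --arg f "$FIELD" 'has($f)' <<<"$RECORD" >/dev/null || MISSING=1
done
echo "evidence contract violated: $MISSING"
```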
  ## Freshness — content-type buckets (NOT one-size)

@@ -206,46 +200,62 @@ Bucket detection lives in `scripts/check-cache.sh`. Override with `--scope`.

  ## Triangulation — Denzin's 4 types, not "3 sources"

- Per `references/research-methodology.md` §4. A claim achieves
- **high-confidence** only when it survives ≥3 INDEPENDENT sources where
- "independent" means satisfying ≥1 of:
+ A claim achieves **high-confidence** only when it survives ≥3 INDEPENDENT sources where "independent" means satisfying ≥1 of:

  - **Data triangulation** — different time/place/persons
- - **Investigator triangulation** — different authors with no shared
- funding/employer
- - **Theoretical triangulation** — different theoretical framings reach
- the same conclusion
- - **Methodological triangulation** — different methods (survey vs
- interview vs telemetry) converge
+ - **Investigator triangulation** — different authors with no shared funding/employer
+ - **Theoretical triangulation** — different theoretical framings reach the same conclusion
+ - **Methodological triangulation** — different methods (survey vs interview vs telemetry) converge
+
+ Republication chains and citation cascades count as **one** source. The verify gate flags suspected republication via shared DOM fingerprints + ownership trees.
+
+ **Render rule (synthesize)**: confidence is rendered as a parenthetical next to each finding bullet — `_[high — Anthropic + Vercel + NN/g]_`. NOT as a separate methodology box, NOT as an Ontology Map, NOT as a triangulation matrix.
+
+ ## Diminishing-returns detection (research-query)
+
+ After every batch of fetches, the query agent evaluates whether the last 3 tool steps produced new signal:
+
+ - New non-boilerplate tokens added to `claims.jsonl` in the last 3 steps < 500
+ - Tool work in the last 3 steps < $0.01
+
+ If both are below threshold → emit a `NoProgress` event to `$SESSION_DIR/progress.log`. After **2 consecutive `NoProgress` events** on a sub-question → terminate that sub-question. After **4 consecutive** across the run → terminate research entirely and hand off to synthesize.
+
+ Thresholds are illustrative — calibrate per project.
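A minimal sketch of the stop rule with those thresholds hardcoded (real token and cost instrumentation is assumed; for brevity this version counts total rather than strictly consecutive events):

```shell
# Illustrative NoProgress stop rule. NEW_TOKENS / STEP_COST_CENTS stand in for
# real instrumentation of the last 3 tool steps.
SESSION_DIR=$(mktemp -d)
SUBQ=1
NEW_TOKENS=120       # non-boilerplate tokens added to claims.jsonl, last 3 steps
STEP_COST_CENTS=0    # tool spend in whole cents (< 1 means < $0.01)

if [ "$NEW_TOKENS" -lt 500 ] && [ "$STEP_COST_CENTS" -lt 1 ]; then
  echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) NoProgress sub-q=$SUBQ" >> "$SESSION_DIR/progress.log"
fi

# Simplified: counts logged events for this sub-question instead of tracking
# strictly consecutive ones.
EVENTS=$(grep -c "NoProgress sub-q=$SUBQ" "$SESSION_DIR/progress.log")
if [ "$EVENTS" -ge 2 ]; then
  echo "terminate sub-question $SUBQ"
fi
echo "NoProgress events: $EVENTS"
```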
+
+ ## Length scales with `effort_tier`
+
+ | `effort_tier` | Target lines | Sections expected |
+ | ------------- | ------------ | ------------------------------------------- |
+ | `simple` | 80–150 | TL;DR · What we found · Sources |
+ | `comparison` | 150–280 | + Why this matters · Trade-offs |
+ | `complex` | 280–450 | + Where evidence disagrees · Open questions |

- Republication chains and citation cascades count as **one** source.
- `scripts/verify-citations.sh` flags suspected republication via shared DOM
- fingerprints + ownership trees.
+ Going under target means not enough information; going over means the reader bails. Anchor on these ranges.

  ## Scripts (`.claude/skills/research/scripts/`)

  | Script | Purpose |
  | --------------------- | ---------------------------------------------------------------------------------------------------- |
  | `check-cache.sh` | Slugify topic, scan `/docs/research/`, classify content-type bucket, return reuse/delta/full verdict |
- | `verify-citations.sh` | Per citation: HTTP 200, quote grep, DOI Crossref check, write verify.json |
+ | `verify-citations.sh` | Per citation: HTTP 200, quote grep against `snapshot_path`, DOI Crossref check, write verify.json |
  | `dedup-research.sh` | Detect overlap between docs (Jaccard on concept lists + citation overlap), suggest merge |
  | `update-index.sh` | Regenerate `/docs/research/index.md` + per-folder indexes from frontmatter |
  | `extract-claims.py` | Pull atomic claims with citations from a rendered doc into JSONL |

  ## Templates (`.claude/skills/research/templates/`)

- | Template | Output |
- | ---------------------------- | ------------------------------------------- |
- | `research.md.tpl` | Main `/docs/research/<slug>.md` deliverable |
- | `adr.md.tpl` | Nygard ADR for decision questions |
- | `moc.md.tpl` | Map of Content for cross-topic themes |
- | `index.md.tpl` | TOC for `/docs/research/index.md` |
- | `research-state.schema.json` | Schema for state JSONL entries |
+ | Template | Output |
+ | ---------------------------- | ------------------------------------------------------------------ |
+ | `research.md.tpl` | Main `/docs/research/<slug>.md` deliverable (engineering-blog briefing) |
+ | `adr.md.tpl` | Nygard ADR for decision questions |
+ | `moc.md.tpl` | Map of Content for cross-topic themes |
+ | `index.md.tpl` | TOC for `/docs/research/index.md` |
+ | `research-state.schema.json` | Schema for state JSONL entries |

  ## References (read on demand)

- - `references/research-methodology.md` — the 15-topic deep methodology bible
- - `references/ontology-patterns.md` — SKOS-adapted relationship vocabulary + LLM-friendly ontology examples
+ - `references/research-methodology.md` — methodology (triangulation, freshness, query engineering)
+ - `references/ontology-patterns.md` — INTERNAL grouping vocabulary for synthesize (NOT rendered as a section)
  - `references/source-directory.md` — per-domain authoritative sources, authority hierarchies, AI-content red flags
  - `references/domain-playbooks.md` — step-by-step protocols per research domain (UX, library eval, API, ADR, market, academic, news, security, pricing)

@@ -254,13 +264,20 @@ fingerprints + ownership trees.
  1. **Cache first**. Never burn query tokens on a fresh cache hit.
  2. **Auto-proceed after scout**. Print summary, immediately dispatch query. NO confirmation gate. User can interrupt mid-run.
  3. **Every claim has URL+QUOTE+ACCESSED-AT+VERIFY-METHOD**. Verify gate fails closed on violations.
- 4. **No fabricated citations, ever**. If a quote cannot be greppped in the fetched page, the claim is dropped.
+ 4. **No fabricated citations, ever**. If a quote cannot be grepped in the fetched page, the claim is dropped.
  5. **Triangulate by Denzin type, not raw count**. 3 republications of one wire story = 1 source.
  6. **Content-type freshness**. Don't apply fast-bucket aging to slow-bucket topics or vice versa.
  7. **Output to `/docs/research/`** — never to `.claude/skills/research-cache/` (legacy).
  8. **English output by default** — even when triggered in Portuguese. Override with `--lang pt`.
  9. **Summary to user ≤5 sentences**. Doc body lives in the file.
  10. **Skill ⊥ super-design ⊥ e2e-audit**. If the user asked for a UX audit or test audit, hand off — do not improvise.
+ 11. **No Ontology Map section in the rendered doc.** Ontology vocabulary is internal-only.
+ 12. **No 30+ concept frontmatter list.** Cap at 8.
+ 13. **Findings render as flat bolded bullets**, not numbered subsections with paragraphs + blockquotes.
+ 14. **Citations default to embedded hyperlinks in prose.** `[Anthropic Engineering](URL)` style. Numeric `[1]` is opt-in via `--cite-style numeric-footnote`.
+ 15. **Parallel fan-out is the norm** for `comparison` and `complex` tiers. Serial execution only for `simple`.
+ 16. **Honor diminishing-returns detection.** 2 consecutive NoProgress events per sub-question → stop that sub-question; 4 across the run → stop research.
+ 17. **Length scales with `effort_tier`.** Don't pad, don't truncate.

  ## Boundaries (what this skill does NOT do)

@@ -272,13 +289,8 @@ fingerprints + ownership trees.

  ## Invocation triggers (enforced by SessionStart hook)

- EN: `research`, `investigate`, `find info`, `search for`, `look up`,
- `evaluate`, `compare`, `audit literature`, `competitor analysis`,
- `market research`, `library evaluation`, `prior art`.
+ EN: `research`, `investigate`, `find info`, `search for`, `look up`, `evaluate`, `compare`, `audit literature`, `competitor analysis`, `market research`, `library evaluation`, `prior art`.

- PT: `pesquisar`, `pesquisa`, `pesquise`, `investigar`, `buscar info`,
- `procurar info`, `comparar`, `avaliar biblioteca`, `análise de
- mercado`, `análise de concorrentes`.
+ PT: `pesquisar`, `pesquisa`, `pesquise`, `investigar`, `buscar info`, `procurar info`, `comparar`, `avaliar biblioteca`, `análise de mercado`, `análise de concorrentes`.

- The hook injects this context at session start. Claude must read this
- SKILL.md before improvising a research plan.
+ The hook injects this context at session start. Claude must read this SKILL.md before improvising a research plan.
@@ -7,111 +7,81 @@ content_type_bucket: "{{BUCKET}}" # fast | medium | slow | permanent
  freshness: "{{FRESHNESS}}" # fresh | aging | stale | outdated
  freshness_window_days: {{WINDOW_DAYS}}
  playbook: "{{PLAYBOOK}}"
- domain: "{{DOMAIN}}"
  sources_count: {{SOURCES_COUNT}}
  findings_count: {{FINDINGS_COUNT}}
- disagreements_count: {{DISAGREEMENTS_COUNT}}
- open_questions_count: {{OPEN_Q_COUNT}}
  confidence_summary: "{{CONFIDENCE_SUMMARY}}" # e.g. "5 high · 3 medium · 1 low"
- concepts:
- {{CONCEPTS_YAML_LIST}}
- session_id: "{{SESSION_ID}}"
+ concepts: # CAP at 8 — primary search keywords only
+ {{CONCEPTS_YAML_LIST_MAX_8}}
  ---

- # Research: {{TITLE}}
+ # {{TITLE}}

  > Bucket: **{{BUCKET}}** · Status: **{{FRESHNESS}}** · Confidence: {{CONFIDENCE_SUMMARY}}
  > Session: `docs/research/.cache/sessions/{{SESSION_ID}}/`

- ## Executive summary
-
- {{EXEC_SUMMARY}}
-
- ## Question
-
- {{ORIGINAL_QUESTION}}
-
- ## Ontology Map
+ ---

- Concepts and their relationships using the SKOS-adapted vocabulary
- (see `.claude/skills/research/references/ontology-patterns.md`).
+ ## TL;DR — {{TLDR_HEADLINE}}

- ```
- {{ONTOLOGY_RELATIONS}}
- ```
+ {{TLDR_LEAD_PARAGRAPH_1_TO_3_SENTENCES}}

- ## Findings
+ {{#each TLDR_BULLET}}
+ {{N}}. **{{VERDICT_PHRASE}}.** {{ONE_SENTENCE_RATIONALE}} ({{INLINE_CITATION}})
+ {{/each}}

- {{#each FINDING}}
- ### Finding {{ID}} — {{TITLE}}
+ ---

- {{ASSERTION}}
+ ## Why this matters

- **Confidence:** {{CONFIDENCE}}{{#if FRESHNESS_WARNING}} · _Freshness warning: {{FRESHNESS_WARNING}}_{{/if}}{{#if TRIANGULATION_WARNING}} · _Triangulation: {{TRIANGULATION_WARNING}}_{{/if}}
+ {{CONTEXT_2_TO_4_PARAGRAPHS ground the reader in the problem the research actually addresses; cite the originating constraint or pain point. Keep it engineering-blog tone. NO methodology box, NO triangulation discussion here.}}

- Evidence:
+ ---

- {{#each EVIDENCE}}
- > "{{QUOTE}}" [{{SOURCE_ID}}]
- > URL: {{URL}}
- > Accessed: {{ACCESSED_AT}}
- > Verify: {{VERIFY_METHOD}}
- {{/each}}
+ ## What we found

+ {{#each FINDING}}
+ - **{{ASSERTION_AS_VERDICT}}** — {{ONE_OR_TWO_SENTENCE_EVIDENCE_SUMMARY_WITH_INLINE_HYPERLINK}}. _[{{CONFIDENCE}} — {{TRIANGULATION_TAGS}}]_
  {{/each}}

- ## Disagreements
-
- {{#each DISAGREEMENT}}
- ### {{TOPIC}}
-
- - **Position A** ([{{SRC_A}}]): {{POSITION_A}}
- - **Position B** ([{{SRC_B}}]): {{POSITION_B}}
- - **Resolution requires:** {{RESOLUTION_HINT}}
- {{/each}}
+ > Use bolded-bullet flat format. Do NOT use the heavy "### Finding N" / paragraph / blockquote / confidence-label pattern.
+ > Inline citation style: embedded hyperlink in the prose — `[Anthropic Engineering](URL)` — NOT bracketed numerics. Reserve `[1]` numerics only when the user explicitly requested footnote style.

- ## Recommendations
+ {{#if DISAGREEMENTS_EXIST}}
+ ---

- ### DO
+ ## Where the evidence disagrees

- {{#each RECOMMENDATION_DO}}
- - {{TEXT}} — _{{REASON}}_
+ {{#each DISAGREEMENT}}
+ - **{{TOPIC}}**: [{{SRC_A_LABEL}}]({{SRC_A_URL}}) says "{{POSITION_A}}". [{{SRC_B_LABEL}}]({{SRC_B_URL}}) says "{{POSITION_B}}". Resolution would require {{RESOLUTION_HINT}}.
  {{/each}}
+ {{/if}}

- ### AVOID
+ ---

- {{#each RECOMMENDATION_AVOID}}
- - {{TEXT}} — _{{REASON}}_
- {{/each}}
+ ## Trade-offs

- ## Implementation Path
+ {{TRADE_OFFS_2_TO_5_BULLETS what the recommended approach gives up, NOT a separate "AVOID" list. Frame as "Choosing X means losing Y." Each bullet cites at least one source.}}

- {{#each STEP}}
- {{N}}. {{TEXT}}
- {{/each}}
+ {{#if OPEN_QUESTIONS_EXIST}}
+ ---

- ## Open Questions
+ ## Open questions

  {{#each OPEN_Q}}
  - {{TEXT}}
  {{/each}}
+ {{/if}}

- ## Dead Ends
-
- _Searched but not found / not applicable_
-
- {{#each DEAD_END}}
- - {{TEXT}}
- {{/each}}
+ ---

  ## Sources

- | ID | Title | Publisher | Authority (1-5) | Independence | Accessed | URL |
- |----|-------|-----------|-----------------|--------------|----------|-----|
+ | ID | Title | Publisher | Authority | Independence | Accessed |
+ |----|-------|-----------|-----------|--------------|----------|
  {{#each SOURCE}}
- | {{ID}} | {{TITLE}} | {{PUBLISHER}} | {{AUTHORITY_LEVEL}} | {{INDEPENDENCE}} | {{ACCESSED_AT}} | {{URL}} |
+ | {{ID}} | [{{TITLE}}]({{URL}}) | {{PUBLISHER}} | {{AUTHORITY_LEVEL}}/5 | {{INDEPENDENCE}} | {{ACCESSED_AT}} |
  {{/each}}

  ---

- _Generated by the `research` skill. Pipeline: scout → query → synthesize → verify. Verify status: {{VERIFY_STATUS}}._
+ _Research pipeline: scout → query → synthesize → verify. Verify status: **{{VERIFY_STATUS}}**. {{N_PASS}} pass · {{N_STALE}} stale · {{N_FAIL}} fail._