@faviovazquez/deliberate 0.2.1 → 0.2.4

package/BRAINSTORM.md CHANGED
@@ -14,6 +14,16 @@ Brainstorm mode is not a stripped-down deliberation. It's a separate process opt
  /deliberate --brainstorm --members assumption-breaker,reframer,pragmatic-builder "new pricing model"
  /deliberate --brainstorm --triad product "what features should v2 include?"
  /deliberate --brainstorm --visual "landing page redesign"
+ /deliberate --brainstorm --research "what are the best approaches to financial data lineage?"
+ /deliberate --brainstorm --research=code "how should we refactor this module?"
+ /deliberate --brainstorm --research=web "new data governance frameworks we should consider"
+ ```
+
+ On Windsurf:
+ ```
+ @deliberate brainstorm with research: what are the best approaches to financial data lineage?
+ @deliberate brainstorm look at the codebase first: how should we refactor the ingestion pipeline?
+ @deliberate brainstorm search the web: new data governance frameworks we should consider
  ```

  ## HARD GATE
@@ -35,6 +45,25 @@ Read the relevant context before asking any questions:

  Produce a brief context summary (3-5 sentences) so the user can confirm you understand the space.

+ #### Research Grounding (optional — `--research` flag only)
+
+ If `--research`, `--research=web`, or `--research=code` is set (or natural language on Windsurf: "with research", "search the web", "look at the codebase first", "do research first"):
+
+ 1. Read `protocols/research-grounding.md` for the full protocol
+ 2. Execute Steps R1–R4 **before** asking clarifying questions in Phase 3
+ 3. Include the Codebase Context Summary and/or Web Research Summary in the context summary presented to the user
+ 4. Inject grounding evidence into every agent prompt in Phase 5 (Divergent Phase)
+
+ The grounding evidence supplements — but does not replace — the clarifying questions in Phase 3. Agents still receive the clarified scope from Phase 3 alongside the research findings.
+
+ Announce to the user:
+ ```
+ Research mode active. I'll scan [the codebase / the web / both] before we start.
+ This gives the agents real evidence to reason from, not just parametric knowledge.
+ ```
+
+ If `--research` is NOT set, do not spontaneously research. Proceed with standard context exploration only.
+
  ### Phase 2: Visual Companion Offer

  If this brainstorm is likely to involve visual content (UI, architecture diagrams, mockups, design comparisons), offer the visual companion:
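The bracketed scope in the announcement above (`[the codebase / the web / both]`) varies with the research variant. A minimal sketch of how it could be rendered — hypothetical, since the published skill expresses this as a prompt template in BRAINSTORM.md, not shipped code:

```javascript
// Hypothetical rendering of the research-mode announcement.
// The "both" wording is an interpretation of the template's "[... / both]".
const RESEARCH_SCOPE = {
  both: "the codebase and the web",
  web: "the web",
  code: "the codebase",
};

function researchAnnouncement(mode) {
  const scope = RESEARCH_SCOPE[mode] || RESEARCH_SCOPE.both;
  return [
    `Research mode active. I'll scan ${scope} before we start.`,
    "This gives the agents real evidence to reason from, not just parametric knowledge.",
  ].join("\n");
}
```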
package/CHANGELOG.md CHANGED
@@ -5,6 +5,24 @@ All notable changes to this project will be documented in this file.
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

+ ## [0.2.4] - 2025-04-03
+
+ ### Added
+ - `assets/research_grounding.png`: diagram explaining the `--research` grounding flow — pipeline stages R1–R5, codebase scan vs web search breakdown, without/with contrast panel
+ - Embedded `research_grounding.png` in README at the `--research` section
+
+ ## [0.2.3] - 2025-04-03
+
+ ### Added
+ - `--research` flag: opt-in grounding phase before Round 1 — scans codebase and/or searches the web before agents analyze
+ - `--research=web`: web search only variant
+ - `--research=code`: codebase scan only variant
+ - `protocols/research-grounding.md`: full grounding protocol (search strategy, codebase scan levels, scope limits, evidence quality rules)
+ - Research grounding integrated into SKILL.md coordinator sequence (Step 0), Quick mode, Duo mode
+ - Research grounding integrated into BRAINSTORM.md Phase 1
+ - Natural language Windsurf triggers: "with research", "search the web", "look at the codebase first", "do research first"
+ - README: Research Grounding section with dual-platform examples, grounding phase description, evidence rules
+
  ## [0.2.1] - 2025-04-03

  ### Added
package/README.md CHANGED
@@ -282,8 +282,75 @@ Pre-defined agent groups for common scenarios. Use when you don't want to pick i
  | `--profile {name}` | Use named profile (full, lean, exploration, execution) | `/deliberate --profile exploration "question"` |
  | `--visual` | Launch browser-based visual companion | `/deliberate --visual --full "question"` |
  | `--save {slug}` | Override auto-generated filename slug for output | `/deliberate --save my-decision "question"` |
+ | `--research` | Grounding phase before Round 1: scan codebase + search web. Agents reason from retrieved evidence. **Opt-in only.** | `/deliberate --research "question"` |
+ | `--research=web` | Web search only (no codebase scan) | `/deliberate --research=web "question"` |
+ | `--research=code` | Codebase scan only (no web search) | `/deliberate --research=code "question"` |

- > **Windsurf/Cursor note:** Flags are expressed as natural language after `@deliberate`. The skill parses intent from your message. Examples: `@deliberate full deliberation: ...`, `@deliberate quick: ...`, `@deliberate architecture triad: ...`, `@deliberate brainstorm: ...`.
+ > **Windsurf/Cursor note:** Flags are expressed as natural language after `@deliberate`. The skill parses intent from your message. Examples: `@deliberate full deliberation: ...`, `@deliberate quick: ...`, `@deliberate architecture triad: ...`, `@deliberate brainstorm: ...`, `@deliberate with research: ...`, `@deliberate search the web before answering: ...`, `@deliberate look at the codebase first: ...`.
+
+ ---
+
+ ## Research Grounding (`--research`)
+
+ By default, agents reason from **parametric knowledge** — what the model internalized during training. This works well for timeless architectural reasoning. It breaks down for:
+
+ - **Your codebase**: agents don't know what's in it unless they read it
+ - **Recent events**: model cutoffs miss new library releases, regulatory changes, pricing shifts, CVEs
+ - **Current benchmarks**: performance comparisons from 18 months ago may be reversed today
+ - **Competitor landscape**: what Bloomberg, Palantir, or a new startup shipped last quarter
+
+ `--research` adds a **grounding phase before Round 1**. The coordinator scans your codebase and/or searches the web, packages the findings into a Codebase Context Summary and Web Research Summary, and injects this evidence into every agent's prompt. Agents then reason from real, retrieved facts — and are required to cite what they found and flag what was missing.
+
+ **This is opt-in only. It is never the default.** Not every question needs research — and research consumes tokens. Use it when parametric knowledge is insufficient.
+
+ <p align="center">
+ <img src="assets/research_grounding.png" alt="Research Grounding — --research flag" width="100%" />
+ </p>
+
+ ### Research modes
+
+ **Claude Code:**
+ ```
+ /deliberate --research "should we adopt Apache Iceberg for our data lake?"
+ /deliberate --research=web "what are the current alternatives to dbt for data transformation?"
+ /deliberate --research=code "is our current ingestion pipeline a good candidate for refactoring?"
+ /deliberate --triad architecture --research "should we migrate our REST APIs to GraphQL?"
+ /deliberate --brainstorm --research "what data quality monitoring approaches should we consider?"
+ ```
+
+ **Windsurf / Cursor:**
+ ```
+ @deliberate with research: should we adopt Apache Iceberg for our data lake?
+ @deliberate search the web before answering: what are the current alternatives to dbt for data transformation?
+ @deliberate look at the codebase first: is our ingestion pipeline a good candidate for refactoring?
+ @deliberate architecture triad with research: should we migrate our REST APIs to GraphQL?
+ @deliberate brainstorm do research first: what data quality monitoring approaches should we consider?
+ ```
+
+ ### What the grounding phase does
+
+ **Codebase scan** (activated by `--research` or `--research=code`):
+ 1. Reads `AGENTS.md`, project README, top-level structure
+ 2. Reads relevant entry points, dependency manifests, config files
+ 3. Follows specific files surfaced by the question domain (auth layers for security questions, hot paths for performance, etc.)
+ 4. Caps at **10 files maximum** — precision over volume
+ 5. Produces a **Codebase Context Summary** shared with all agents
+
+ **Web search** (activated by `--research` or `--research=web`):
+ 1. Runs **5 targeted searches maximum** — bad searches waste context
+ 2. Prioritizes recency (last 12 months) and primary sources (official docs, engineering blogs, CVE databases, academic papers)
+ 3. Produces a **Web Research Summary** with findings + source URLs + recency note
+ 4. `assumption-breaker` is explicitly tasked with scrutinizing source quality
+
+ ### Grounding evidence rules
+
+ Agents treat grounding evidence as **facts to reason from, not conclusions to accept**:
+ - They must cite specific findings ("the codebase currently uses library X at version Y, which has Z implication")
+ - They must flag gaps ("the web research didn't find benchmark data for >10M records/day — my analysis is therefore first-principles")
+ - `assumption-breaker` scrutinizes source quality and flags cherry-picked or outdated evidence
+ - Agents that over-rely on evidence get flagged in the coordinator's Round 2 dispatch
+
+ If the platform has no web search or file access tools, the coordinator announces the limitation and falls back gracefully to parametric knowledge.

  ---

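The README above documents both flag syntax (`--research`, `--research=web`, `--research=code`) and the natural-language Windsurf/Cursor triggers. That intent-matching could be sketched roughly as follows — hypothetical code, since the skill does this parsing inside its prompt instructions rather than in a shipped module:

```javascript
// Hypothetical sketch of research-mode detection. Names are illustrative.
const NL_TRIGGERS = {
  web: [/search the web/i],
  code: [/look at the codebase first/i],
  both: [/\bwith research\b/i, /\bdo research first\b/i],
};

// Returns "web", "code", "both", or null. Research is opt-in only:
// no flag and no trigger means no grounding phase runs.
function researchMode(message) {
  const flag = message.match(/--research(?:=(web|code))?\b/);
  if (flag) return flag[1] || "both";
  for (const [mode, patterns] of Object.entries(NL_TRIGGERS)) {
    if (patterns.some((re) => re.test(message))) return mode;
  }
  return null;
}
```

Returning `null` by default mirrors the documented rule that research is never spontaneous.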
package/SKILL.md CHANGED
@@ -35,6 +35,9 @@ description: "Multi-agent deliberation skill. Forces structured disagreement to
  | `--profile {name}` | Use named profile (full, lean, exploration, execution). |
  | `--visual` | Launch visual companion for this session. |
  | `--save {slug}` | Override auto-generated filename slug for output. |
+ | `--research` | Run a grounding phase before Round 1: scan the codebase and/or search the web. Agents reason from retrieved evidence, not parametric knowledge alone. **Opt-in only — never the default.** See `protocols/research-grounding.md`. |
+ | `--research=web` | Web search only (no codebase scan). |
+ | `--research=code` | Codebase scan only (no web search). |

  ## The 14 Core Agents

@@ -117,14 +120,27 @@ These agents are structural counterweights. When both are present, genuine disag

  The coordinator is the orchestration layer. It does NOT have its own opinion. It routes, enforces protocol, and synthesizes.

- ### Step 0: Platform Detection
+ ### Step 0: Research Grounding (optional — `--research` flag only)
+
+ If `--research`, `--research=web`, or `--research=code` is set (or natural language equivalents on Windsurf: "do research first", "search the web", "look at the codebase first", "with research"):
+
+ 1. Read the full protocol at `protocols/research-grounding.md`
+ 2. Execute Steps R1–R4 from that protocol before dispatching any agents
+ 3. Package the grounding context (Codebase Context Summary and/or Web Research Summary) for injection into all agent prompts in Step 5
+ 4. After Round 1, run Step R5 (evidence quality check)
+
+ If `--research` is NOT set, skip this step entirely. Do not spontaneously research unless the user explicitly requests it.
+
+ ---
+
+ ### Step 1: Platform Detection

  Read `configs/defaults.yaml` to determine:
  - **Claude Code**: Use parallel subagents. Each agent gets its own context window.
  - **Windsurf**: Use sequential role-prompting within single context. Default to "lean" profile unless user overrides.
  - **Other**: Fall back to sequential mode.

- ### Step 1: Problem Restatement
+ ### Step 2: Problem Restatement

  Before any agent speaks, the coordinator restates the user's question in neutral, precise terms. This prevents framing bias from the original question.

@@ -133,7 +149,7 @@ Before any agent speaks, the coordinator restates the user's question in neutral
  {Neutral restatement of the user's question, stripped of loaded language}
  ```

- ### Step 2: Agent Selection
+ ### Step 3: Agent Selection

  Based on flags:
  - `--full`: All 14 core agents
@@ -145,7 +161,7 @@ Based on flags:

  If auto-detection is ambiguous, present the top 2-3 triad matches and let the user choose.

- ### Step 3: Model Routing
+ ### Step 4: Model Routing

  Read `configs/defaults.yaml` for tier mapping:
  - Agents marked `model: high` get the high-tier model
@@ -154,7 +170,7 @@ Read `configs/defaults.yaml` for tier mapping:

  If `configs/provider-model-slots.yaml` exists, use manual overrides instead of auto-routing.

- ### Step 4: Visual Companion (optional)
+ ### Step 5: Visual Companion (optional)

  If `--visual` flag is set, or if the user has previously accepted the visual companion in this session:
  1. Launch `scripts/start-server.sh --project-dir {project_root}`
@@ -162,7 +178,7 @@ If `--visual` flag is set, or if the user has previously accepted the visual com
  3. Tell user to open the URL
  4. Visual companion will be updated after each round

- ### Step 5: Round 1 -- Independent Analysis
+ ### Step 6: Round 1 -- Independent Analysis

  Each selected agent receives:
  ```
@@ -170,7 +186,12 @@
  You are {agent_name}. Read your full agent definition at agents/{agent_name}.md.

  ## Problem
- {Problem restatement from Step 1}
+ {Problem restatement from Step 2}
+
+ ## Grounding Evidence [only present if --research is active]
+ {Codebase Context Summary and/or Web Research Summary from Step 0}
+ Use this evidence in your analysis. You may cite specific findings.
+ If the evidence is insufficient, note what additional information would change your position.

  ## Instructions
  Produce your independent analysis following your Analytical Method.
@@ -181,7 +202,7 @@ Follow your Standalone Output Format.
  **Claude Code execution**: Launch all agents as parallel subagents. Wait for all to complete.
  **Windsurf execution**: Prompt each agent role sequentially. Collect all outputs before proceeding.

- ### Step 6: Round 2 -- Cross-Examination
+ ### Step 7: Round 2 -- Cross-Examination

  Each agent receives ALL Round 1 outputs and must respond:

@@ -209,7 +230,7 @@ You MUST disagree with at least one position.
  **Claude Code execution**: Sequential (each agent needs to see prior cross-examinations for richer engagement).
  **Windsurf execution**: Sequential.

- ### Step 7: Enforcement Scans
+ ### Step 8: Enforcement Scans

  After Round 2, the coordinator checks:

@@ -220,7 +241,7 @@ After Round 2, the coordinator checks:
  5. **Novelty gate**: Did Round 2 introduce at least one idea not present in Round 1? If not, flag stale deliberation.
  6. **Groupthink flag**: If ALL agents agree on the core position, flag: "Unanimous agreement may indicate groupthink. Consider invoking `risk-analyst` or `assumption-breaker` standalone for a second opinion."

- ### Step 8: Round 3 -- Crystallization
+ ### Step 9: Round 3 -- Crystallization

  Each agent states their FINAL position:

@@ -233,7 +254,7 @@ If you changed your mind during cross-examination, state the new position and wh

  **Execution**: Parallel (Claude Code) or sequential (Windsurf). No interaction needed.

- ### Step 9: Verdict Synthesis
+ ### Step 10: Verdict Synthesis

  The coordinator synthesizes all three rounds into a structured verdict:

@@ -278,7 +299,7 @@ The coordinator synthesizes all three rounds into a structured verdict:
  {Questions raised during deliberation that were not resolved}
  ```

- ### Step 10: Save Output
+ ### Step 11: Save Output

  Save the full deliberation record to `deliberations/YYYY-MM-DD-HH-MM-{mode}-{slug}.md`.

@@ -290,10 +311,10 @@ If visual companion is active, push the verdict formation view to the browser.

  Quick mode skips Round 2 (cross-examination):

- 1. Steps 0-5 (same as full)
- 2. Skip Steps 6-7
- 3. Step 8: Crystallization (agents state final position based only on seeing Round 1 outputs)
- 4. Steps 9-10 (same as full)
+ 1. Steps 0-6 (same as full, including research grounding if `--research` is set)
+ 2. Skip Steps 7-8
+ 3. Step 9: Crystallization (agents state final position based only on seeing Round 1 outputs)
+ 4. Steps 10-11 (same as full)

  Quick mode is faster and cheaper but produces less refined disagreement. Use for time-sensitive decisions where the diversity of initial perspectives is more valuable than deep cross-examination.

@@ -303,14 +324,14 @@ Quick mode is faster and cheaper but produces less refined disagreement. Use for

  Duo mode runs two agents in dialectic:

- 1. Steps 0-4 (same as full, but only 2 agents)
+ 1. Steps 0-5 (same as full, but only 2 agents, including research grounding if `--research` is set)
  2. **Exchange Round 1**: Agent A states position (400 words). Agent B states position (400 words).
  3. **Exchange Round 2**: Agent A responds to Agent B (300 words). Agent B responds to Agent A (300 words).
  4. **Synthesis**: Coordinator synthesizes the exchange into a verdict highlighting:
  - Where the two agents agree
  - Where they disagree and why
  - What the user should weigh when deciding
- 5. Step 10 (save output)
+ 5. Step 11 (save output)

  Duo mode is ideal for binary decisions ("should we do X or not?"). The polarity pairs table above lists the most productive pairings.

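SKILL.md above documents the output path convention `deliberations/YYYY-MM-DD-HH-MM-{mode}-{slug}.md`, with `--save` overriding the auto-generated slug. A small hypothetical helper showing the convention concretely (the shipped skill builds this path in its prompt flow, not via this function):

```javascript
// Hypothetical helper mirroring the documented save-path convention.
// `slug` comes from --save or is auto-generated by the coordinator.
function deliberationPath(mode, slug, now = new Date()) {
  const pad = (n) => String(n).padStart(2, "0");
  const stamp = [
    now.getFullYear(),
    pad(now.getMonth() + 1), // JS months are 0-indexed
    pad(now.getDate()),
    pad(now.getHours()),
    pad(now.getMinutes()),
  ].join("-");
  return `deliberations/${stamp}-${mode}-${slug}.md`;
}
```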
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@faviovazquez/deliberate",
- "version": "0.2.1",
+ "version": "0.2.4",
  "description": "Multi-agent deliberation skill for AI coding assistants. Agreement is a bug.",
  "license": "MIT",
  "author": "Favio Vazquez",
@@ -27,7 +27,7 @@
  "brainstorming"
  ],
  "bin": {
- "deliberate": "./bin/cli.js"
+ "deliberate": "bin/cli.js"
  },
  "files": [
  "bin/",