@zigrivers/scaffold 3.10.1 → 3.12.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,337 @@
1
+ ---
2
+ name: spark
3
+ description: Explore a raw project idea through Socratic questioning and research
4
+ summary: "Takes a vague idea and turns it into a well-formed idea brief through probing questions, competitive research, and innovation expansion. At higher depths, dispatches multi-model research and adversarial stress-testing. Feeds directly into create-vision."
5
+ phase: null
6
+ order: null
7
+ dependencies: []
8
+ outputs: [docs/spark-brief.md]
9
+ conditional: null
10
+ stateless: true
11
+ category: tool
12
+ knowledge-base: [ideation-craft, multi-model-research-dispatch]
13
+ argument-hint: "<idea or blank for interactive>"
14
+ ---
15
+
16
+ ## Purpose
17
+
18
+ Turn a vague project idea into a well-formed idea brief through Socratic
19
+ questioning and active research. Spark is two things in one: an interviewer
20
+ that asks hard questions AND a research companion that explores the problem
21
+ space and brings back insights the user hasn't considered.
22
+
23
+ The output (`docs/spark-brief.md`) feeds into `create-vision` as optional
24
+ upstream context, accelerating the vision step without replacing it.
25
+
26
+ **Prerequisite:** Requires `scaffold init` to have been run first.
27
+
28
+ ## Inputs
29
+ - User's idea (provided via `$ARGUMENTS` or interactively)
30
+ - Existing `docs/spark-brief.md` (if rerunning — triggers update/fresh choice)
31
+ - Web search results (depth 2+, if available on the platform)
32
+ - External model responses (depth 4+, if Codex/Gemini CLI available)
33
+
34
+ ## Expected Outputs
35
+ - `docs/spark-brief.md` — directional idea brief with 8 sections
36
+
37
+ ## Rerun Detection
38
+
39
+ Before starting, check if `docs/spark-brief.md` already exists:
40
+
41
+ **If the file exists:**
42
+ 1. Read the existing brief and present a 2-3 sentence summary to the user.
43
+ 2. Ask: "Update this brief or start fresh?"
44
+ 3. **Update mode**: Use the existing brief as a starting point. Preserve content
45
+ that is still relevant. Focus on deepening, expanding, or revising specific
46
+ sections. Increment the tracking comment version (e.g., `v1` → `v2`).
47
+ 4. **Fresh mode**: Overwrite the brief entirely. Start from Phase 1.
48
+
49
+ **If the file does NOT exist:** Proceed to Phase 1.
50
+
51
+ ## Instructions
52
+
53
+ ### Phase 1: Seed
54
+
55
+ Capture and clarify the raw idea.
56
+
57
+ **If `$ARGUMENTS` is provided:** Use it as the starting idea. Confirm your
58
+ understanding with the user before proceeding.
59
+
60
+ **If `$ARGUMENTS` is blank:** Ask: "What idea do you want to explore?"
61
+
62
+ Clarify the basics through progressive questioning. Batch 2-3 related
63
+ questions per turn:
64
+
65
+ **Turn 1** — What are you building? Who is it for? What problem does it solve?
66
+
67
+ **Turn 2** — How do people solve this today? What's painful about that? How
68
+ often do they experience this pain?
69
+
70
+ **Turn 3** — Describe the person who needs this most. What are they doing the
71
+ moment before they reach for your product? What does "success" look like for them?
72
+
73
+ **Turn 4+** — Follow the gaps. If the audience is unclear, pull on that. If the
74
+ problem is well-defined but the solution is vague, focus there. Don't follow
75
+ a script — follow what's missing.
76
+
77
+ **Exit condition:** You can articulate the idea back to the user in 2-3
78
+ sentences and the user confirms "yes, that's it."
79
+
80
+ After Phase 1 completes, assess adaptive heuristics (see Adaptive Behavior
81
+ section below) to calibrate Phases 2-4 intensity.
82
+
83
+ ### Phase 2: Research
84
+
85
+ Ground the idea in reality through competitive and market research.
86
+ Thoroughness scales with the project's configured depth.
87
+
88
+ **Depth 1:** Knowledge-based reasoning only. No web search. Draw on training
89
+ data to identify the most obvious competitors and alternatives.
90
+
91
+ **Depth 2:** 1-2 quick web searches for the most obvious competitors +
92
+ knowledge-based reasoning.
93
+
94
+ **Depth 3:** 2-3 targeted searches — direct competitors, market size, and
95
+ technology landscape.
96
+
97
+ **Depth 4:** Comprehensive research + dispatch to 1 external model for
98
+ independent competitive research.
99
+
100
+ ```bash
101
+ # Check Codex or Gemini availability (see multi-model-research-dispatch knowledge)
102
+ # Prefer Gemini for research (Google search built-in)
103
+ # If unavailable, primary model does enhanced research with explicit
104
+ # competitor-analysis framing
105
+ ```
106
+
107
+ **Depth 5:** Comprehensive research + multi-model dispatch with reconciliation.
108
+ Dispatch to both Codex AND Gemini for diverse perspectives. Reconcile:
109
+ 2+ agree = consensus. Disagree = divergent — always present minority views.
110
+ Single model (fallback) = skip reconciliation labels.
111
+
112
+ **At all depths:** Bring findings INTO the conversation. Don't dump raw results —
113
+ synthesize: "I found 4 apps in this space — here's what they do well and
114
+ where they fall short."
115
+
116
+ **Exit condition (by depth):**
117
+ - Depth 1-2: At least 2 alternatives named (competitor, "do nothing," or
118
+ adjacent). User acknowledges the landscape.
119
+ - Depth 3: Direct competitors and at least 1 indirect alternative researched
120
+ with strengths/weaknesses. User acknowledges.
121
+ - Depth 4-5: Comprehensive landscape including direct, indirect, and emerging
122
+ threats. If multi-model dispatched, perspectives synthesized. User acknowledges.
123
+
124
+ ### Phase 3: Expand
125
+
126
+ Surface opportunities the user hasn't considered. Use the lightweight
127
+ expansion patterns from ideation-craft knowledge:
128
+
129
+ - Adjacent market: "Your users also need X — have you considered that?"
130
+ - Ecosystem play: "If you solve A, you're the natural place for B and C."
131
+ - Contrarian angle: "Everyone does X. What if you did the opposite?"
132
+ - Technology enabler: "A new capability makes Y possible — could that reshape
133
+ your approach?"
134
+ - AI-native rethinking: "If AI could handle Z, how would that change the product?"
135
+
136
+ **Depth scaling:**
137
+ - Depth 1-2: 1-2 expansion suggestions with brief rationale.
138
+ - Depth 3: 3-5 ideas with rationale.
139
+ - Depth 4-5: Full expansion pass leveraging Phase 2 research. Generate ideas
140
+ from existing data — no new searches in this phase.
141
+
142
+ Tag all expansion ideas as **preliminary** — the pipeline's `innovate-vision`
143
+ step does comprehensive strategic expansion later.
144
+
145
+ **Exit condition:** Present each expansion idea to the user. Each gets an
146
+ explicit disposition: **accept** (include in brief), **defer** (note as open
147
+ question), or **reject** (drop).
148
+
149
+ ### Phase 4: Challenge
150
+
151
+ Converge. Challenge every assumption surfaced in Phases 1-3, including
152
+ accepted expansion ideas.
153
+
154
+ **What to challenge:**
155
+ - **Feasibility**: Can this actually be built with the stated resources/timeline?
156
+ - **Scope**: Is this too broad? "If you could only ship ONE feature, what is it?"
157
+ - **Technical reality**: Are there hard technical constraints being glossed over?
158
+ - **Positioning**: "Three competitors already do this. What's your genuine
159
+ differentiator?"
160
+ - **Accepted expansions**: Phase 3 accepts are baseline intent. Phase 4 may
161
+ scope down or reject accepted ideas if they critically fail feasibility or
162
+ technical reality checks. Don't re-litigate the value of ideas the user
163
+ accepted — but DO challenge whether they're buildable and whether the overall
164
+ positioning holds against the competitive landscape.
165
+
166
+ **Exit condition:** Each challenged assumption is confirmed or revised by the
167
+ user. Core scope is explicitly locked — the user knows what's in and what's out.
168
+
169
+ ### Phase 5: Synthesize
170
+
171
+ Write `docs/spark-brief.md`. Create the `docs/` directory if it doesn't exist.
172
+
173
+ The brief is intentionally shallow — directional hypotheses, not validated
174
+ conclusions. Target 2-4 sentences or concise bullet points per section.
175
+ Sections may state "None identified" if inapplicable.
176
+
177
+ **At depth 1-3:** Present the brief to the user for final approval. Write the
178
+ file to disk after approval. This is the terminal phase — spark is complete.
179
+
180
+ **At depth 4+:** Generate the draft brief in conversation (not yet written to
181
+ disk). Present to the user for awareness: "Here's what I have before we
182
+ stress-test it." Then proceed to Phase 6.
183
+
184
+ Use the template in the Spark Brief Template section below.
185
+
186
+ ### Phase 6: Red-Team (depth 4+ only)
187
+
188
+ Send the draft spark brief to available external models as adversarial
189
+ reviewers.
190
+
191
+ **Depth 4:** Dispatch to 1 external model.
192
+ **Depth 5:** Dispatch to both Codex AND Gemini with reconciliation.
193
+
194
+ **Red-team prompt for external models:**
195
+
196
+ ```
197
+ You are an adversarial reviewer stress-testing a product idea brief.
198
+ Your job is to find weaknesses, challenge assumptions, and surface missed
199
+ opportunities.
200
+
201
+ SPARK BRIEF:
202
+ [Full content of the draft spark-brief.md]
203
+
204
+ CHALLENGE INSTRUCTIONS:
205
+ 1. For each section, identify the weakest assumption and explain why it might
206
+ be wrong.
207
+ 2. What competitors or market dynamics does the brief underestimate?
208
+ 3. What technical feasibility risks are glossed over?
209
+ 4. What user segments or use cases are missing?
210
+ 5. If you could only flag ONE critical risk, what would it be?
211
+
212
+ Be constructive but ruthless. Respond in structured markdown.
213
+ ```
214
+
215
+ **Execution:**
216
+
217
+ ```bash
218
+ # Codex
219
+ codex exec --skip-git-repo-check -s read-only --ephemeral "RED_TEAM_PROMPT" 2>&1
220
+
221
+ # Gemini
222
+ NO_BROWSER=true gemini -p "RED_TEAM_PROMPT" --output-format json --approval-mode yolo 2>/dev/null
223
+ ```
224
+
225
+ **If no external models available:** Fall back to primary model with distinct
226
+ "red team" system prompt. Use the three-perspective approach from
227
+ multi-model-research-dispatch knowledge (VC, competitor PM, skeptical user).
228
+
229
+ **Processing challenges:**
230
+ - Present each challenge to the user one at a time.
231
+ - For each: **accept** (update the brief), **dismiss** (explain why), or
232
+ **defer** (note as open question).
233
+ - Update the brief based on accepted challenges.
234
+
235
+ **Exit condition:** User reviews all red-team findings and gives final approval.
236
+ Write the updated brief to disk.
237
+
238
+ ### Adaptive Behavior
239
+
240
+ Assess these heuristics continuously, beginning in Phase 1. Use the idea's
241
+ characteristics to calibrate behavior across all phases:
242
+
243
+ - **Well-formed idea** → Phase 1 is brief. Move to research quickly.
244
+ - **Crowded space** → Phases 3 and 4 intensify. More expansion ideas to
245
+ differentiate, more competitive positioning challenges.
246
+ - **Novel idea (no competitors)** → Phase 2 shifts to adjacent-space and
247
+ analogous-system research. Phase 4 focuses on market-existence risk.
248
+
249
+ Phase transitions use natural conversational pivots, not mechanical
250
+ announcements. ("Now that I understand the core idea, let me research what
251
+ else is out there...")
252
+
253
+ ## Methodology Scaling
254
+
255
+ | Depth | Phase 1 (Seed) | Phase 2 (Research) | Phase 3 (Expand) | Phase 4 (Challenge) | Phase 5 (Synthesize) | Phase 6 (Red-Team) |
256
+ |-------|----------------|-------------------|-------------------|--------------------|--------------------|-------------------|
257
+ | 1 | 2-3 questions | Knowledge only, no search | 1 suggestion | Light — 1-2 key challenges | Brief, terminal | Skip |
258
+ | 2 | 2-3 questions | 1-2 quick searches + knowledge | 1-2 suggestions | Light challenge | Brief, terminal | Skip |
259
+ | 3 | 5-8 questions | 2-3 targeted searches | 3-5 ideas | Full challenge | Brief, terminal | Skip |
260
+ | 4 | 5-8 questions | Comprehensive + 1 external model | Full expansion | Full challenge | Draft, continue | 1 external model |
261
+ | 5 | 8-12 questions | Comprehensive + multi-model w/ reconciliation | Full expansion | Full challenge | Draft, continue | Multi-model + reconciliation |
262
+
263
+ **Presets:** mvp = Depth 1 | deep = Depth 5 | custom = user-specified
264
+
265
+ ## Spark Brief Template
266
+
267
+ When writing `docs/spark-brief.md`, use this exact structure:
268
+
269
+ ```markdown
270
+ <!-- scaffold:spark-brief v1 YYYY-MM-DD deep -->
271
+
272
+ # Spark Brief: [Idea Name]
273
+
274
+ > Generated by `scaffold run spark` — directional hypotheses, not validated
275
+ > conclusions. This document feeds into `create-vision` as a starting point,
276
+ > not a replacement.
277
+
278
+ ## Idea & Problem Space
279
+ [What the user wants to build, the core problem it solves, who it's for
280
+ and why they need it — from Phase 1]
281
+
282
+ ## Landscape
283
+ [Key competitors/alternatives with strengths/weaknesses, positioning,
284
+ market context — from Phase 2]
285
+
286
+ ## Expansion Ideas
287
+ [Accepted expansion ideas tagged as preliminary, deferred ideas noted —
288
+ from Phase 3]
289
+
290
+ ## Constraints & Scope
291
+ [Confirmed assumptions, scope boundaries, what's in and what's out,
292
+ locked decisions — from Phase 4]
293
+
294
+ ## Technology Opportunities
295
+ [Relevant tech enablers surfaced during research/expansion]
296
+
297
+ ## Open Questions
298
+ [Unresolved items flagged during conversation that need answers before building]
299
+
300
+ ## Risks
301
+ [Market, technical, and feasibility risks identified during challenge —
302
+ from Phase 4 and Phase 6 if red-teamed]
303
+
304
+ ## Session Metadata
305
+ - **Depth**: [1-5]
306
+ - **Red-teamed**: [yes/no]
307
+ - **Models consulted**: [list if multi-model, or "primary only"]
308
+ - **Date**: [YYYY-MM-DD]
309
+ ```
310
+
311
+ **Tracking comment format:** `<!-- scaffold:spark-brief v[N] YYYY-MM-DD [methodology] -->` where:
312
+ - `v[N]` increments on each update (v1, v2, v3...)
313
+ - `YYYY-MM-DD` is the session date
314
+ - `[methodology]` is the active methodology preset (e.g., `deep`, `mvp`, `custom`)
315
+
316
+ **Idea identity:** The idea name is captured in the `# Spark Brief: [Idea Name]` heading, not in the tracking comment. For identity matching, create-vision compares the heading's idea name against `$ARGUMENTS`.
317
+
318
+ ## How to Work With Me
319
+ - I'm your co-founder for the next few minutes. I'll challenge you AND do homework on your behalf.
320
+ - I'll ask hard questions. That's the point — weak assumptions caught now save months later.
321
+ - I'll research while we talk. When I find something relevant, I'll bring it into the conversation.
322
+ - Don't hold back on vague ideas. "Something with recipes" is a perfectly fine starting point.
323
+ - Tell me if I'm going down the wrong path. This is a conversation, not a lecture.
324
+
325
+ ## After This Step
326
+
327
+ When spark is complete, tell the user:
328
+
329
+ ---
330
+ **Spark complete** — `docs/spark-brief.md` created.
331
+
332
+ **Next:** Run `scaffold run create-vision` — the vision step will detect your
333
+ spark brief and use it as a starting point, accelerating the discovery process.
334
+
335
+ **Pipeline reference:** `scaffold run prompt-pipeline`
336
+
337
+ ---
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@zigrivers/scaffold",
3
- "version": "3.10.1",
3
+ "version": "3.12.0",
4
4
  "description": "AI-powered software project scaffolding pipeline",
5
5
  "type": "module",
6
6
  "workspaces": [
@@ -64,7 +64,7 @@ fi
64
64
 
65
65
  The `!` prefix runs the command in the user's terminal session, allowing interactive auth flows (browser OAuth, Y/n prompts) that can't work in headless mode.
66
66
 
67
- **If neither CLI is available or authenticated**: Fall back to structured Claude-only self-review. Re-read the artifact with an adversarial lens — actively try to find issues the initial review missed. Document this as "single-model review (no external CLIs available)."
67
+ **If neither CLI is available or authenticated**: Queue a compensating Claude pass focused on the failed channel's strength area. Document this as "single-model review (no external CLIs available)."
68
68
 
69
69
  ## Correct Invocation Patterns
70
70
 
@@ -134,6 +134,12 @@ NO_BROWSER=true gemini -p "REVIEW_PROMPT_HERE" --output-format json -s --approva
134
134
 
135
135
  **Output**: JSON on stdout with `{ response, stats, error }` structure.
136
136
 
137
+ ## Foreground-Only Execution
138
+
139
+ Always run Codex and Gemini CLI commands as foreground Bash calls. Never use `run_in_background`, `&`, or `nohup`. Background execution produces empty or truncated output from both CLIs. Multiple foreground calls in a single message are fine — the tool runner supports parallel invocations.
140
+
141
+ This means: when dispatching reviews, make each CLI call a separate foreground Bash tool invocation. Do NOT use shell `&` or background subshells.
142
+
137
143
  ## Context Bundling
138
144
 
139
145
  When dispatching a review, bundle all relevant context into the prompt. Each CLI gets the same bundle — do NOT share one model's review with the other.
@@ -208,35 +214,27 @@ You are reviewing a pull request diff. Report P0, P1, and P2 issues.
208
214
  | PR diff | Full diff | If >2000 lines, split into file groups |
209
215
  | Implementation plan | Task list + representative tasks | Include full task list, detail for flagged tasks |
210
216
 
211
- ## Dual-Model Reconciliation
212
-
213
- When both CLIs produce results, reconcile findings using these rules:
217
+ ## Finding Reconciliation
214
218
 
215
- | Scenario | Confidence | Action |
216
- |----------|-----------|--------|
217
- | Both flag same issue | **High** | Fix immediately — two independent models agree |
218
- | Both approve (no findings) | **High** | Proceed confidently |
219
- | One flags P0, other approves | **High** | Fix it — P0 is critical enough from a single source |
220
- | One flags P1, other approves | **Medium** | Review the finding carefully before fixing. If the finding is specific and actionable, fix it. If vague, skip. |
221
- | Models contradict each other | **Low** | Present both findings to the user for adjudication |
219
+ When multiple models produce findings, reconcile them using the rules defined in `multi-model-review-dispatch`. Key principles:
222
220
 
223
- **Independence rule**: Never share one model's review output with the other. Each model must review the artifact independently to avoid confirmation bias.
221
+ - **Independence rule**: Never share one model's review output with the other. Each model must review the artifact independently to avoid confirmation bias.
222
+ - **Round tracking**: For iterative reviews (like PR review loops), track the round number. After 3 fix rounds with unresolved findings, stop and surface the verdict (`blocked` or `needs-user-decision`) to the user. Do NOT auto-merge.
224
223
 
225
- **Round tracking**: For iterative reviews (like PR review loops), track the round number. After 3 fix rounds, merge with a warning and create a follow-up issue for remaining findings.
224
+ For the full consensus rules, confidence scoring, and disagreement resolution process, see `multi-model-review-dispatch`.
226
225
 
227
226
  ## Fallback Behavior
228
227
 
229
228
  | Situation | Fallback |
230
229
  |-----------|----------|
231
- | Neither CLI available | Structured Claude-only adversarial self-review |
232
- | Codex only | Single-model review with Codex |
233
- | Gemini only | Single-model review with Gemini |
234
- | **CLI auth expired** | **Surface to user with recovery command — do NOT silently fall back** |
235
- | One CLI fails mid-review (non-auth) | Continue with the other; note the failure in summary |
236
- | Both CLIs fail (non-auth) | Fall back to Claude-only self-review; warn user |
237
- | CLI output not parseable as JSON | Treat as text, extract findings manually |
238
-
239
- **Auth failures are NOT silent fallbacks.** The difference between "CLI not installed" (fall back quietly) and "CLI auth expired" (user action required) is critical. Auth can be fixed in 30 seconds with an interactive command — silently skipping wastes the user's review infrastructure.
230
+ | Neither CLI available | Queue two compensating Claude passes (one per missing channel's strength area). Label findings. Max verdict: `degraded-pass`. |
231
+ | Codex only | Single-model review with Codex + compensating Claude pass for Gemini |
232
+ | Gemini only | Single-model review with Gemini + compensating Claude pass for Codex |
233
+ | **CLI auth expired** | **Surface to user with `!` recovery command — do NOT silently fall back** |
234
+ | One CLI fails mid-review | Use partial results if available, else queue compensating pass. Note failure in summary. |
235
+ | Both CLIs fail | Two compensating passes, max verdict: `degraded-pass`. Warn user. |
236
+
237
+ Auth failures are NOT silent fallbacks.
240
238
 
241
239
  ## Integration with Review Steps
242
240