@zigrivers/scaffold 3.10.1 → 3.11.0

package/README.md CHANGED
@@ -1,6 +1,6 @@
1
1
  # Scaffold
2
2
 
3
- A TypeScript CLI that assembles AI-powered prompts at runtime to guide you from "I have an idea" to working software. Scaffold walks you through 60 structured pipeline steps — organized into 16 phases — plus 10 utility tools, and the supported AI tools handle the research, planning, and implementation for you.
3
+ A TypeScript CLI that assembles AI-powered prompts at runtime to guide you from "I have an idea" to working software. Scaffold walks you through 60 structured pipeline steps — organized into 16 phases — plus 11 utility tools, and the supported AI tools handle the research, planning, and implementation for you.
4
4
 
5
5
  By the end, you'll have a fully planned, standards-documented, implementation-ready project with working code.
6
6
 
@@ -1381,9 +1381,10 @@ These are orthogonal to the pipeline — usable at any time, not tied to pipelin
1381
1381
  | `scaffold run review-code` | Run all 3 code review channels on local code before commit or push. |
1382
1382
  | `scaffold run review-pr` | Run all 3 code review channels (Codex CLI, Gemini CLI, Superpowers) on a PR. |
1383
1383
  | `scaffold run post-implementation-review` | Full 3-channel codebase review after an AI agent completes all tasks — checks requirements coverage, security, architecture alignment, and more. |
1384
+ | `scaffold run spark` | Explore and expand a raw project idea through Socratic questioning, competitive research, and innovation expansion. Produces a `docs/spark-brief.md` that feeds into `create-vision`. At depth 4+, dispatches to external models for independent research and adversarial red-teaming. |
1384
1385
  | `scaffold run session-analyzer` | Analyze Claude Code session logs for patterns and insights. |
1385
1386
 
1386
- Use `scaffold run review-code` before commit or push when you want a local gate on the current delivery candidate. Use `scaffold run review-pr` after a GitHub PR exists.
1387
+ Use `scaffold run spark` before `create-vision` when you have a vague idea that needs sharpening. Use `scaffold run review-code` before commit or push when you want a local gate on the current delivery candidate. Use `scaffold run review-pr` after a GitHub PR exists.
1387
1388
 
1388
1389
  Run any of these via the CLI or ask the scaffold runner skill in Claude Code or Gemini.
1389
1390
 
@@ -0,0 +1,208 @@
1
+ ---
2
+ name: multi-model-research-dispatch
3
+ description: Patterns for dispatching research and adversarial challenge to external AI models (Codex, Gemini) with reconciliation rules and single-model fallback
4
+ topics: [multi-model, research, competitive-analysis, red-team, codex, gemini, dispatch, reconciliation]
5
+ ---
6
+
7
+ # Multi-Model Research Dispatch
8
+
9
+ At higher methodology depths (4+), idea exploration and adversarial challenge benefit from independent research by external AI models. This entry provides dispatch patterns, reconciliation rules, and fallback strategies for research and red-team workflows.
10
+
11
+ ## Summary
12
+
13
+ ### When to Dispatch
+ | Depth | Research Dispatch | Challenge Dispatch |
+ |-------|-------------------|-------------------|
+ | 1-3 | Skip | Skip |
+ | 4 | 1 external model | 1 external model |
+ | 5 | Multi-model with reconciliation | Multi-model with reconciliation |
19
+
20
+ ### Graceful Fallback Chain
21
+ 1. Check if external CLI is available (`which codex`, `which gemini`)
22
+ 2. If available, check auth (`codex login status`, `NO_BROWSER=true gemini -p "respond with ok" -o json`)
23
+ 3. If auth succeeds, dispatch with timeout
24
+ 4. If CLI unavailable or auth fails, skip that model — note in Session Metadata
25
+ 5. If no external models available, fall back to primary model with distinct framing prompts
26
+ 6. Never block the session waiting for unavailable tools
27
+
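The dispatch table and fallback chain above can be read as one decision function. The following is a minimal sketch, not the package's implementation: every type and function name (`CliProbe`, `decideDispatch`, and so on) is hypothetical, the probe results are assumed to come from the `which` and auth checks, and the Gemini preference at depth 4 is borrowed from the Model Selection table.

```typescript
// Hypothetical names throughout: nothing here is taken from the scaffold codebase.
type ExternalModel = "codex" | "gemini";

interface CliProbe {
  model: ExternalModel;
  installed: boolean; // outcome of `which <cli>`
  authOk: boolean; // outcome of the auth check
}

interface DispatchDecision {
  mode: "skip" | "external" | "primary-fallback";
  models: ExternalModel[];
  metadataNotes: string[]; // lines destined for Session Metadata
}

function decideDispatch(depth: number, probes: CliProbe[]): DispatchDecision {
  const metadataNotes: string[] = [];
  // Depths 1-3 never dispatch.
  if (depth < 4) {
    return { mode: "skip", models: [], metadataNotes };
  }
  const ready: ExternalModel[] = [];
  for (const probe of probes) {
    if (!probe.installed) {
      metadataNotes.push(`${probe.model}: CLI unavailable, skipped`);
    } else if (!probe.authOk) {
      metadataNotes.push(`${probe.model}: auth failed, skipped`);
    } else {
      ready.push(probe.model);
    }
  }
  if (ready.length === 0) {
    metadataNotes.push("no external models: primary model with distinct framing prompts");
    return { mode: "primary-fallback", models: [], metadataNotes };
  }
  // Assumption: at depth 4 prefer Gemini when it is ready, otherwise take whatever is.
  const ordered = ready.slice().sort((a, b) => (a === b ? 0 : a === "gemini" ? -1 : 1));
  const models = depth === 4 ? ordered.slice(0, 1) : ordered;
  return { mode: "external", models, metadataNotes };
}
```

Returning the notes alongside the decision keeps skipped tools visible in the output, so the function never blocks on a missing tool and never falls back silently.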
28
+ ### Reconciliation Rules
29
+ - **2+ models agree** on the same finding = **consensus** — high confidence, present as validated
30
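+ - **Models disagree** = **divergent** — present ALL perspectives including minority views. Do NOT suppress the minority. A 2-1 split where the lone dissent flags a real risk is more valuable than a comfortable consensus.
+ - **Only one model surfaces a finding** = **unique**: not necessarily wrong, so present it without discounting.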
31
+ - **Single model** (fallback) = skip reconciliation labels. Present findings directly without consensus/divergent framing.
32
+
33
+ ## Deep Guidance
34
+
35
+ ### CLI Availability Check
36
+
37
+ Before dispatching, verify CLI tools are installed and authenticated:
38
+
+ ```bash
+ # Codex CLI
+ which codex >/dev/null 2>&1 && codex login status 2>/dev/null
+ # Exit 0 = ready. Non-zero = skip Codex.
+
+ # Gemini CLI
+ which gemini >/dev/null 2>&1 && NO_BROWSER=true gemini -p "respond with ok" -o json 2>&1
+ # Check for "ok" in response. Exit 41 = auth failure.
+ ```
48
+
49
+ If auth fails, tell the user which tool failed and how to fix it:
50
+ - Codex: "Codex auth expired — run `! codex login` to re-authenticate"
51
+ - Gemini: "Gemini auth expired — run `! gemini -p \"hello\"` to re-authenticate"
52
+
53
+ Auth failures are NOT silent fallbacks — surface them explicitly.
54
+
55
+ ### Timeout Handling
56
+
+ | Dispatch type | Timeout |
+ |---------------|---------|
+ | Research dispatch (idea summary + questions) | 120 seconds |
+ | Challenge dispatch (full brief review) | 180 seconds |
61
+
62
+ If a dispatch times out:
63
+ - Use whatever partial response was received (if parseable)
64
+ - Note the timeout in Session Metadata
65
+ - Do NOT retry — proceed with available data
66
+
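The timeout rules above reduce to a small bookkeeping step once a dispatch ends. This is a sketch under stated assumptions: `settleDispatch` and its types are hypothetical names, the caller is assumed to measure elapsed time itself, and "parseable" is approximated as "non-empty after trimming", which the source does not define.

```typescript
// Hypothetical helper; the timeout values are the ones from the table above.
type DispatchKind = "research" | "challenge";

const TIMEOUT_SECONDS: Record<DispatchKind, number> = {
  research: 120,
  challenge: 180,
};

interface SettledDispatch {
  timedOut: boolean;
  usableOutput: string | null; // null when nothing parseable came back
  metadataNote: string | null; // recorded in Session Metadata on timeout
  retry: false; // a timed-out dispatch is never retried
}

function settleDispatch(kind: DispatchKind, elapsedSeconds: number, output: string): SettledDispatch {
  const limit = TIMEOUT_SECONDS[kind];
  const timedOut = elapsedSeconds >= limit;
  // Assumption: "parseable" is approximated as "non-empty after trimming".
  const trimmed = output.trim();
  const usableOutput = trimmed.length > 0 ? trimmed : null;
  const metadataNote = timedOut
    ? `${kind} dispatch timed out after ${limit}s; ${usableOutput ? "partial response kept" : "no usable response"}`
    : null;
  return { timedOut, usableOutput, metadataNote, retry: false };
}
```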
67
+ ### Research Dispatch Mode
68
+
69
+ **When**: Phase 2 at depth 4-5.
70
+
71
+ **Prompt template for external model:**
72
+
+ ```
+ You are conducting independent competitive research for a product idea.
+
+ IDEA: [1-2 sentence summary of the idea from Phase 1]
+
+ RESEARCH QUESTIONS:
+ 1. What are the direct competitors in this space? For each, note what they do well and where they fall short.
+ 2. What indirect alternatives exist — different approaches to the same problem?
+ 3. How do users currently cope without a dedicated solution?
+ 4. What recent market signals exist — funding rounds, product launches, shutdowns, regulatory changes?
+ 5. What adjacent markets or analogous systems could inform this idea?
+
+ Be thorough and honest. Acknowledge competitor strengths — do not dismiss them.
+ Respond in structured markdown with one section per question.
+ ```
88
+
89
+ **Execution:**
90
+
+ ```bash
+ # Codex
+ codex exec --skip-git-repo-check -s read-only --ephemeral "RESEARCH_PROMPT" 2>&1
+
+ # Gemini
+ NO_BROWSER=true gemini -p "RESEARCH_PROMPT" --output-format json --approval-mode yolo 2>/dev/null
+ ```
98
+
99
+ **Processing results:**
100
+ - Parse the response as structured markdown
101
+ - Extract key findings per research question
102
+ - If multi-model (depth 5), run reconciliation (see below)
103
+ - Present findings to the user conversationally, not as raw output
104
+
105
+ ### Challenge Dispatch Mode (Red-Team)
106
+
107
+ **When**: Phase 6 at depth 4-5.
108
+
109
+ **Prompt template for external model:**
110
+
+ ```
+ You are an adversarial reviewer stress-testing a product idea brief.
+ Your job is to find weaknesses, challenge assumptions, and surface missed opportunities.
+
+ SPARK BRIEF:
+ [Full content of the draft spark-brief.md]
+
+ CHALLENGE INSTRUCTIONS:
+ 1. For each section, identify the weakest assumption and explain why it might be wrong.
+ 2. What competitors or market dynamics does the brief underestimate?
+ 3. What technical feasibility risks are glossed over?
+ 4. What user segments or use cases are missing?
+ 5. If you could only flag ONE critical risk, what would it be?
+
+ Be constructive but ruthless. The goal is to strengthen the idea, not validate it.
+ Respond in structured markdown with one section per challenge area.
+ ```
128
+
129
+ **Processing results:**
130
+ - Parse challenges from response
131
+ - Present each challenge to the user one at a time
132
+ - For each challenge, ask: "Accept (update the brief), dismiss (explain why it's not applicable), or defer (note as open question)?"
133
+ - Track dispositions and update the brief accordingly
134
+
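Tracking dispositions, as described in the processing steps above, could look roughly like the sketch below. The type and function names are illustrative assumptions; the source only specifies the three outcomes (accept, dismiss, defer) and what each one implies for the brief.

```typescript
// Hypothetical shapes for tracking red-team dispositions; not the package's own types.
type Disposition = "accept" | "dismiss" | "defer";

interface ChallengeDecision {
  challenge: string;
  disposition: Disposition;
  note: string; // brief revision, dismissal reason, or question to carry forward
}

interface BriefChanges {
  revisions: string[]; // accepted challenges: update the brief
  dismissed: string[]; // dismissed challenges, with the user's reason
  openQuestions: string[]; // deferred challenges, noted as open questions
}

function applyDispositions(decisions: ChallengeDecision[]): BriefChanges {
  const changes: BriefChanges = { revisions: [], dismissed: [], openQuestions: [] };
  for (const d of decisions) {
    switch (d.disposition) {
      case "accept":
        changes.revisions.push(`${d.challenge}: ${d.note}`);
        break;
      case "dismiss":
        changes.dismissed.push(`${d.challenge} (dismissed: ${d.note})`);
        break;
      case "defer":
        changes.openQuestions.push(d.challenge);
        break;
    }
  }
  return changes;
}
```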
135
+ ### Single-Model Fallback
136
+
137
+ When no external models are available, the primary model simulates multiple perspectives:
138
+
139
+ **Perspective 1 — Venture Capitalist**: "Analyze this idea as a VC evaluating a pitch. What's the market size? What's the defensibility? What are the unit economics? Would you invest?"
140
+
141
+ **Perspective 2 — Competitor's Product Lead**: "You're the product lead at [biggest competitor]. You just learned about this idea. What's your reaction? What would you do to defend your position? What aspects worry you?"
142
+
143
+ **Perspective 3 — Skeptical End User**: "You're a potential user who has tried and abandoned 3 similar products. What would make you try this one? What would make you abandon it after a week? What's the one thing that would keep you?"
144
+
145
+ Run each perspective as a separate reasoning pass. Synthesize the three viewpoints into findings the user can act on.
146
+
147
+ ### Model Selection
148
+
+ | Task | Recommended model | Rationale |
+ |------|-------------------|-----------|
+ | Research dispatch | Either Codex or Gemini | Both capable of web-informed reasoning |
+ | Challenge dispatch | Either Codex or Gemini | Adversarial analysis is model-agnostic |
+ | Depth 4 (1 model) | Prefer Gemini (Google search built-in) | Strongest for competitive research |
+ | Depth 5 (multi) | Both Codex AND Gemini | Diverse perspectives from different architectures |
155
+
156
+ ### Reconciliation Process (Depth 5)
157
+
158
+ When two or more models return research findings, reconcile them:
159
+
160
+ 1. **Extract findings**: Parse each model's response into discrete findings (one competitor, one market signal, one risk = one finding).
161
+ 2. **Match findings**: Compare findings across models. Two findings match if they reference the same entity (competitor, trend, risk) even if the wording differs.
162
+ 3. **Classify each finding**:
163
+ - **Consensus**: 2+ models independently identified the same finding. High confidence.
164
+ - **Divergent**: Models disagree about the same entity (e.g., one says competitor X is strong, another says X is weak). Present both perspectives with reasoning.
165
+ - **Unique**: Only one model surfaced this finding. Not necessarily wrong — may be the most valuable insight. Present it without discounting.
166
+ 4. **Synthesize for the user**: Present findings grouped by classification. Lead with consensus (highest confidence), then unique (potential insights), then divergent (needs user judgment).
167
+ 5. **Never suppress minority views**: A lone model flagging a risk that others missed may be the most important finding in the entire research pass.
168
+
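The reconciliation steps above can be sketched as a grouping pass followed by a classification. The sketch below makes two simplifying assumptions that the source does not: findings are matched by a normalized entity name (the text allows matching when wording differs, which would need fuzzier logic), and disagreement is detected through a hypothetical `stance` field. All names are illustrative.

```typescript
// Hypothetical data model; entity matching by normalized name is a simplification.
type Stance = "positive" | "negative" | "neutral";
type ReconciliationLabel = "consensus" | "divergent" | "unique";

interface ModelFinding {
  model: string; // e.g. "codex" or "gemini"
  entity: string; // competitor, trend, or risk the finding refers to
  stance: Stance;
  summary: string;
}

interface ReconciledFinding {
  entity: string; // normalized (trimmed, lowercased) entity key
  label: ReconciliationLabel;
  sources: ModelFinding[]; // every original finding is preserved
}

function reconcile(findings: ModelFinding[]): ReconciledFinding[] {
  // Steps 1-2: group findings that reference the same entity.
  const byEntity = new Map<string, ModelFinding[]>();
  for (const finding of findings) {
    const key = finding.entity.trim().toLowerCase();
    const group = byEntity.get(key);
    if (group) {
      group.push(finding);
    } else {
      byEntity.set(key, [finding]);
    }
  }
  // Step 3: classify each group.
  const reconciled: ReconciledFinding[] = [];
  byEntity.forEach((sources, entity) => {
    const models = new Set(sources.map((s) => s.model));
    const stances = new Set(sources.map((s) => s.stance));
    let label: ReconciliationLabel;
    if (models.size < 2) {
      label = "unique";
    } else if (stances.size > 1) {
      label = "divergent";
    } else {
      label = "consensus";
    }
    reconciled.push({ entity, label, sources });
  });
  // Step 4: lead with consensus, then unique, then divergent.
  const order: Record<ReconciliationLabel, number> = { consensus: 0, unique: 1, divergent: 2 };
  return reconciled.sort((a, b) => order[a.label] - order[b.label]);
}
```

Keeping `sources` intact on every result is what guards against the over-synthesis and minority-suppression problems: nothing is merged away before the user sees it.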
169
+ ### Quality Gates
170
+
171
+ Before presenting research findings to the user, verify:
172
+
173
+ - At least 2 competitors or alternatives identified (even at depth 4 with single model)
174
+ - Each competitor has both a strength and a weakness documented
175
+ - The "do nothing" option is addressed (how users cope without any tool)
176
+ - Market timing signals are present (why now?)
177
+ - If multi-model: reconciliation labels (consensus/divergent/unique) are applied
178
+
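The quality gates above are mechanical enough to check in code before anything is shown to the user. A minimal sketch follows; the `ResearchFindings` shape and `failedQualityGates` name are assumptions made for illustration, not structures defined by the package.

```typescript
// Hypothetical shape for collected research; field names are illustrative.
interface CompetitorNote {
  name: string;
  strengths: string[];
  weaknesses: string[];
}

interface ResearchFindings {
  competitors: CompetitorNote[];
  doNothingOption: string; // how users cope without any tool
  timingSignals: string[]; // "why now?" evidence
  modelCount: number; // how many models contributed
  labelsApplied: boolean; // consensus/divergent/unique labels present
}

// Returns one message per failed gate; an empty array means all gates pass.
function failedQualityGates(findings: ResearchFindings): string[] {
  const failures: string[] = [];
  if (findings.competitors.length < 2) {
    failures.push("fewer than 2 competitors or alternatives identified");
  }
  for (const competitor of findings.competitors) {
    if (competitor.strengths.length === 0 || competitor.weaknesses.length === 0) {
      failures.push(`${competitor.name}: needs both a strength and a weakness`);
    }
  }
  if (findings.doNothingOption.trim().length === 0) {
    failures.push("the do-nothing option is not addressed");
  }
  if (findings.timingSignals.length === 0) {
    failures.push("no market timing signals");
  }
  if (findings.modelCount > 1 && !findings.labelsApplied) {
    failures.push("multi-model run without reconciliation labels");
  }
  return failures;
}
```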
179
+ ### Common Anti-Patterns
180
+
+ | Anti-pattern | Problem | Fix |
+ |-------------|---------|-----|
+ | Dismissing competitors | "They're not really competition" — every alternative is competition | Acknowledge strengths honestly |
+ | Echo chamber | Both models agree because both drew from the same training data | Look for unique findings, not just consensus |
+ | Recency bias | Focusing only on recent launches, ignoring established players | Include both established and emerging competitors |
+ | Feature-list comparison | Comparing feature lists instead of positioning | Compare on audience, value prop, and differentiation |
+ | Silent fallback | External model fails, no mention in output | Always note which models were used and any failures |
+ | Over-synthesis | Merging distinct findings into one summary, losing nuance | Preserve individual findings before synthesizing |
189
+
190
+ ### Output Format
191
+
192
+ When presenting research findings to the user, structure them as:
193
+
194
+ **Competitive Landscape:**
195
+ - [Competitor 1]: Strengths — [specifics]. Weaknesses — [specifics]. Why users choose them — [specifics].
196
+ - [Competitor 2]: ...
197
+ - "Do nothing" option: How users cope today — [specifics]. Why it's insufficient — [specifics].
198
+
199
+ **Market Signals:**
200
+ - [Signal 1]: [What happened, when, why it matters for this idea]
201
+ - [Signal 2]: ...
202
+
203
+ **Expansion Opportunities** (from adjacent market research):
204
+ - [Opportunity 1]: [What it is, why it's relevant, how it connects]
205
+
206
+ **Red-Team Challenges** (from adversarial review):
207
+ - [Challenge 1]: [Weakness identified, why it matters, recommended action]
208
+ - Disposition: [accept/dismiss/defer — tracked after user response]
@@ -0,0 +1,100 @@
1
+ ---
2
+ name: game-ideation
3
+ description: Game-specific ideation techniques for spark — core loop, player fantasy, retention, session design, monetization
4
+ topics: [game-dev, ideation, core-loop, player-fantasy, retention, monetization, session-design]
5
+ ---
6
+
7
+ Game ideation applies game-specific lenses — core loop, player fantasy, retention mechanics, session design, and monetization — to the spark tool's ideation flow. It supplements the general ideation-craft entry when a user is exploring a game idea.
8
+
9
+ ## Summary
10
+
11
+ ### Game Ideation Lenses
12
+ Five lenses to apply during idea exploration: **Core loop** (what the player does every 30 seconds), **Player fantasy** (the emotional experience, not mechanics), **Retention** (what brings players back), **Session design** (how long and how satisfying), **Monetization** (how the game sustains itself).
13
+
14
+ ### Quick Tests
15
+ - **Core loop**: Can you describe it in one sentence without "and"?
16
+ - **Player fantasy**: Does every major mechanic reinforce it?
17
+ - **Retention**: What happens if the player leaves for a week?
18
+
19
+ ## Deep Guidance
20
+
21
+ ### Core Loop Identification
22
+ - **What is the core loop?** The repeating cycle of actions the player performs most often. In a shooter: aim → shoot → loot → repeat. In a puzzle game: observe → plan → execute → evaluate → repeat.
23
+ - **Ask the user**: "What does the player do every 30 seconds? Every 5 minutes? Every session?"
24
+ - **Test**: Can you describe the core loop in one sentence without using the word "and"? If not, it's too complex or undefined.
25
+
26
+ ### Player Fantasy
27
+ - **What fantasy does the player live out?** Not the game mechanics — the emotional experience. "I am a powerful wizard" not "I cast spells with mana."
28
+ - **Ask the user**: "When the player tells their friend about your game, what do they say it feels like?"
29
+ - **Test**: Does every major mechanic reinforce the fantasy? If a mechanic exists but doesn't serve the fantasy, question why it's there.
30
+
31
+ ### Retention Mechanics
32
+ - **Session hooks**: What brings the player back tomorrow? (Daily rewards, story cliffhangers, social obligations, unfinished goals)
33
+ - **Progression**: What does the player invest that makes leaving costly? (Character levels, base building, collection progress, social reputation)
34
+ - **Ask the user**: "What happens if the player doesn't open the game for a week? Do they lose anything? Miss anything?"
35
+
36
+ ### Session Design
37
+ - **Session length**: How long is a typical play session? (Mobile: 3-5 min. PC: 30-90 min. Console: 60+ min.)
38
+ - **Session arc**: Does each session have a beginning, middle, and satisfying end? Can the player stop mid-session without frustration?
39
+ - **Ask the user**: "Where and when does your player play? Commute? Couch? Desk? This determines session length."
40
+
41
+ ### Monetization Models
42
+ - **Premium**: Pay once, play forever. Best for narrative, creative, or skill-based games.
43
+ - **Free-to-play**: Free entry, monetize through cosmetics, battle pass, or convenience. Best for multiplayer/social games.
44
+ - **Subscription**: Recurring payment for ongoing content. Best for live-service games.
45
+ - **Ask the user**: "How does your player feel about spending money in your game? What would they pay for? What would feel unfair?"
46
+
47
+ ### Applying Game Lenses During Spark Phases
48
+
49
+ **Phase 1 (Seed)**: Ask about the core loop and player fantasy early. These are the foundation — if they're unclear, everything else is built on sand.
50
+
51
+ **Phase 2 (Research)**: Research competitors through a game lens. For each competitor: What's their core loop? What fantasy do they deliver? How do they monetize? What's their session design? Where do player reviews complain?
52
+
53
+ **Phase 3 (Expand)**: Use game-specific expansion angles:
54
+ - "What if the core loop had a social/multiplayer dimension?"
55
+ - "What if you added a metagame layer on top of the core loop?"
56
+ - "What platform would change the experience most? (Mobile → PC, or vice versa)"
57
+ - "What if monetization was through player-created content?"
58
+
59
+ **Phase 4 (Challenge)**: Challenge through game-specific risk lenses:
60
+ - "Core loop fatigue — will this still be fun after 100 hours?"
61
+ - "Monetization pressure — does the business model conflict with the player fantasy?"
62
+ - "Scope vs. team — can a [team size] team build this in [timeline]?"
63
+ - "Platform expectations — does the session design match the platform's usage patterns?"
64
+
65
+ ### Game-Specific Brief Sections
66
+
67
+ When writing the spark brief for a game idea, adapt sections:
68
+ - **Idea & Problem Space** → Include the core loop and player fantasy
69
+ - **Landscape** → Frame competitors by core loop and fantasy, not just features
70
+ - **Expansion Ideas** → Tag which ideas affect the core loop vs. metagame vs. content
71
+ - **Risks** → Include core loop fatigue, monetization/fantasy tension, and scope risks
72
+
73
+ ### Scoping by Project Scale
74
+
+ | Scale | Core loop | Content depth | Monetization | Session design |
+ |-------|-----------|---------------|-------------|----------------|
+ | Game jam (48-72h) | One mechanic, tight loop | Minimal — procedural or template | None (free) | 5-15 min total |
+ | Indie (solo/small team) | 1-2 mechanics, polished | Handcrafted, limited scope | Premium or F2P with cosmetics | 15-60 min sessions |
+ | AA/studio | Multiple interlocking systems | Extensive content pipeline | Any model, balanced | Platform-appropriate |
80
+
81
+ ### Common Game Ideation Anti-Patterns
82
+
83
+ - **The Kitchen Sink**: Trying to combine too many mechanics before any one is fun. Focus the core loop first.
84
+ - **Fantasy Mismatch**: The monetization model undermines the player fantasy. (Pay-to-win in a skill-based competitive game.)
85
+ - **Platform Blindness**: Designing a 90-minute session game for mobile, or a 3-minute session for PC/console.
86
+ - **Retention Treadmill**: Relying on FOMO and daily login rewards instead of intrinsic motivation. Players resent obligation.
87
+ - **Scope Denial**: "We'll just add multiplayer later." Multiplayer is an architecture decision, not a feature toggle.
88
+ - **Clone Trap**: "Like [popular game] but with [small twist]." The twist must be fundamental enough to justify switching costs.
89
+
90
+ ### Core Loop Evaluation Worksheet
91
+
92
+ When evaluating a proposed core loop, walk through these questions:
93
+
94
+ 1. **Primary loop**: What does the player do every 30 seconds? Is it inherently satisfying?
95
+ 2. **Secondary loop**: What does the player do every 5 minutes? Does it give meaning to the primary loop?
96
+ 3. **Tertiary loop**: What does the player do every session? Does it create a sense of progress?
97
+ 4. **Friction test**: Remove one mechanic from the loop. Does the game still work? If yes, that mechanic may be unnecessary.
98
+ 5. **Fantasy alignment**: Does every step in the loop reinforce the player fantasy? If a step breaks immersion, redesign it.
99
+ 6. **Depth test**: Can a skilled player execute the loop differently than a novice? If not, the loop may lack depth.
100
+ 7. **Social test**: Would watching someone else do this loop be entertaining? If not, the loop may lack spectacle or surprise.
@@ -0,0 +1,209 @@
1
+ ---
2
+ name: ideation-craft
3
+ description: Questioning techniques, research methodology, lightweight expansion patterns, and brief synthesis for early-stage idea exploration
4
+ topics: [ideation, questioning, research, competitive-analysis, brief-synthesis, socratic-method]
5
+ ---
6
+
7
+ # Ideation Craft
8
+
9
+ Ideation craft covers the questioning, research, and synthesis techniques used during early-stage idea exploration. It guides a conversational flow from a raw idea through competitive research to a structured idea brief.
10
+
11
+ ## Summary
12
+
13
+ ### Key Techniques
14
+ - **Questioning**: Socratic method (what → who → why → why not), 5 Whys for root cause, "What would have to be true?" for assumptions. Batch 2-3 questions per turn.
15
+ - **Research**: Scan direct competitors, indirect alternatives, and the "do nothing" option. Capture strengths, weaknesses, positioning per competitor. Check adjacent markets and market timing.
16
+ - **Expansion**: Lightweight one-liner prompts — adjacent markets, ecosystem plays, contrarian angles, tech enablers, AI-native rethinking. These are conversation starters, not full strategic methodology.
17
+ - **Synthesis**: 2-4 sentences per brief section. Tag confidence: validated, hypothesized, or speculative. Never fabricate — write "None identified" for empty sections.
18
+
19
+ ## Deep Guidance
20
+
21
+ ### Questioning Techniques
22
+
23
+ - **Socratic method**: Ask progressively deeper questions. Start with "what" (the idea), move to "who" (the audience), then "why" (the problem), then "why not" (the assumptions).
24
+ - **The 5 Whys**: When the user states a problem, ask "why?" five times to reach the root cause. Surface-level problems hide deeper opportunities.
25
+ - **"What would have to be true?"**: For every assumption, ask what conditions must hold for it to work. This surfaces hidden dependencies and risks.
26
+ - **Batching**: Group 2-3 related questions per turn. Don't pepper the user with single questions (wastes turns) or overwhelm with 10 at once (causes shallow answers).
27
+
28
+ ### Progressive Questioning Framework
29
+
30
+ **Turn 1 — Capture the spark**: What are you building? Who is it for? What problem does it solve?
31
+
32
+ **Turn 2 — Dig into the problem**: How do people solve this today? What's painful about the current approach? How often do they experience this pain?
33
+
34
+ **Turn 3 — Understand the audience**: Describe the person who needs this most. What are they doing the moment before they reach for your product? What does "success" look like from their perspective?
35
+
36
+ **Turn 4 — Challenge assumptions**: You said [X] — what evidence do you have? What would have to be true for [Y] to work? If [Z] turned out to be wrong, would the idea still make sense?
37
+
38
+ **Turn 5+ — Deepen based on gaps**: Follow the thread. If the audience is unclear, keep pulling on that. If the problem is well-defined but the solution is vague, focus there. Don't follow a script — follow the gaps.
39
+
40
+ ### Research Methodology
41
+
42
+ - **Competitor scan**: Search for direct competitors (same problem, same audience), indirect alternatives (different approach, same problem), and the "do nothing" option (how users cope today).
43
+ - **What to capture per competitor**: Name, what they do well (be specific), where they fall short (be honest), pricing model, target audience, and why a user might choose them over the idea.
44
+ - **Adjacent markets**: Look for products solving related problems for the same audience, or the same problem for a different audience. These are expansion opportunities.
45
+ - **Market timing**: Why now? What changed (technology, regulation, culture, behavior) that makes this idea viable today when it wasn't before?
46
+
47
+ ### Expansion Patterns (Lightweight)
48
+
49
+ - **Adjacent market**: "Your users also need X — have you considered expanding into that?"
50
+ - **Ecosystem play**: "If you solve A, you become the natural place to also solve B and C."
51
+ - **Contrarian angle**: "Everyone in this space does X. What if you deliberately did the opposite?"
52
+ - **Technology enabler**: "A new capability (API, model, platform) makes Y possible now — could that reshape your approach?"
53
+ - **AI-native rethinking**: "If you assumed AI could handle Z, how would that change the product?"
54
+
55
+ These are conversation starters for Phase 3 (Expand), not full strategic methodology. The pipeline's `innovate-vision` step covers comprehensive strategic expansion later.
56
+
57
+ ### Brief Synthesis
58
+
59
+ - A good directional hypothesis names a specific audience, problem, and approach — not vague aspirations.
60
+ - Bad: "This app will help people be more productive." Good: "Freelance designers who lose 5+ hours/week to invoice tracking — a tool that auto-generates invoices from their time-tracking data."
61
+ - Tag confidence levels: "validated" (user confirmed + research supports), "hypothesized" (user stated but unresearched), "speculative" (surfaced during expansion, unconfirmed).
62
+ - Each brief section should be 2-4 sentences or concise bullet points. If a section has nothing, write "None identified" — don't fabricate.
63
+
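The confidence tags and the "None identified" rule above can be sketched as two small helpers. The names are hypothetical, and one mapping is an assumption: a claim that the user has not confirmed is treated as speculative even if research happens to support it, a case the source does not spell out.

```typescript
// Hypothetical helpers for brief synthesis; not part of the scaffold package.
type Confidence = "validated" | "hypothesized" | "speculative";

interface ClaimEvidence {
  userConfirmed: boolean; // the user stated or confirmed the claim
  researchSupports: boolean; // research backs the claim
}

function tagConfidence(evidence: ClaimEvidence): Confidence {
  if (evidence.userConfirmed && evidence.researchSupports) {
    return "validated";
  }
  if (evidence.userConfirmed) {
    return "hypothesized";
  }
  // Assumption: anything the user has not confirmed stays speculative.
  return "speculative";
}

// Renders a brief section, writing "None identified" rather than inventing content.
function renderBriefSection(title: string, items: string[]): string {
  const body = items.length > 0 ? items.map((item) => `- ${item}`).join("\n") : "None identified";
  return `## ${title}\n${body}`;
}
```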
64
+ ### Competitive Research Process
65
+
66
+ 1. **Start with the obvious**: Search for "[problem] app" or "[problem] tool." The first 5-10 results are the landscape the user will compete against.
67
+ 2. **Check review sites**: App Store reviews, G2, Capterra, Product Hunt comments. Users complain about exactly the gaps a new product can fill.
68
+ 3. **Look for failures**: Search "[category] startup failed" or "[competitor] shutdown." Failed attempts tell you what didn't work and why.
69
+ 4. **Find the "do nothing" option**: How do people cope without any tool? Spreadsheets, manual processes, asking friends? This is often the biggest competitor.
70
+ 5. **Assess timing**: Search for recent news, funding rounds, regulatory changes, or technology launches in the space. Timing explains why an idea works now when it didn't before.
71
+
72
+ ### Framing Research for External Model Dispatch
73
+
74
+ When dispatching to an external model for competitive research (depth 4+), frame the prompt as:
75
+
76
+ > "Research the competitive landscape for [idea summary]. Identify: (1) Direct competitors solving the same problem for the same audience, (2) Indirect alternatives — different approaches to the same problem, (3) The 'do nothing' option — how users cope today, (4) Recent market signals — funding, launches, shutdowns, regulatory changes. For each competitor, note what they do well and where they fall short. Be honest — acknowledge genuine strengths."
77
+
78
+ ### Brief Section Guidance
79
+
+ | Section | Source Phase | What to write | Common mistakes |
+ |---------|-------------|---------------|-----------------|
+ | Idea & Problem Space | Phase 1 (Seed) | Core idea, specific problem, target audience, why they need it | Too vague ("helps people"), no audience specificity |
+ | Landscape | Phase 2 (Research) | 2-5 competitors with strengths/weaknesses, positioning | Dismissing competitors, listing without analysis |
+ | Expansion Ideas | Phase 3 (Expand) | Accepted ideas tagged as preliminary, deferred ideas noted | Treating preliminary as committed scope |
+ | Constraints & Scope | Phase 4 (Challenge) | Confirmed assumptions, what's in/out, locked decisions | Scope too broad, no explicit "out" list |
+ | Technology Opportunities | Phase 2-3 | Tech enablers discovered during research/expansion | Listing technologies without explaining why they matter |
+ | Open Questions | All phases | Unresolved items that need answers before building | Ignoring questions that feel uncomfortable |
+ | Risks | Phase 4 (Challenge) | Market, technical, feasibility risks with severity | Only listing technical risks, ignoring market risks |
89
+
90
+ ### Audience Definition Techniques
91
+
92
+ Avoid demographic-only definitions ("18-35 year old professionals"). Instead, define audiences by behavior and motivation:
93
+
94
+ **Behavior-based**: "People who currently track expenses in a spreadsheet because existing apps are too complex."
95
+ **Motivation-based**: "Freelancers who want to spend less than 10 minutes per week on invoicing so they can focus on client work."
96
+ **Context-based**: "The moment someone finishes a client project and thinks 'now I have to figure out the invoice' — that's when they need this."
97
+
98
+ **Questions to sharpen audience definition:**
99
+ - What is this person doing the moment before they reach for your product?
100
+ - What is the last thing they tried? Why did it fail them?
101
+ - How would they describe their problem to a friend (not in your language — in theirs)?
102
+ - If you could only serve ONE type of user, who would it be and why?
103
+
104
+ ### Problem Validation Framework
105
+
106
+ Before accepting a problem statement, test it:
107
+
108
+ 1. **Specificity test**: Can you name a real person (or type of person) who has this problem? If "everyone has this problem," it's too vague.
109
+ 2. **Frequency test**: How often does this problem occur? Daily problems are more valuable than annual ones.
110
+ 3. **Severity test**: When this problem occurs, how painful is it? Mild inconvenience or hair-on-fire emergency?
111
+ 4. **Workaround test**: How do people cope today? If they have a workable (even if imperfect) solution, your product must be dramatically better.
112
+ 5. **Willingness test**: Would someone pay money / change habits / switch tools to solve this? If not, the problem may not be valuable enough.
113
+
114
+ ### Scope Sharpening Techniques
115
+
116
+ When the idea is too broad, use these techniques to find the core:
117
+
118
+ - **The one-feature test**: "If your product could only do ONE thing, what would it be?" This reveals the core value proposition.
119
+ - **The removal test**: "If you removed [feature X], would anyone still use the product?" If yes, X is not core.
120
+ - **The first-user test**: "Who is the first person who would use this, and what exactly would they do with it?" This grounds abstract ideas in concrete behavior.
121
+ - **The MVP boundary**: "What is the smallest thing you could build that would make one person's life measurably better?" This defines the initial scope.
122
+ - **The anti-scope list**: Explicitly list what the product does NOT do. This is as important as what it does.
123
+
124
+ ### Positioning Against Competitors
125
+
126
+ When the landscape is crowded, help the user find genuine differentiation:
127
+
128
+ - **Head-to-head**: "Competitor X does this well. You would need to be 10x better at this specific thing to win users away. Can you be?"
129
+ - **Underserved segment**: "Competitor X serves enterprise. Is there an underserved segment (freelancers, students, non-profits) that you could own?"
130
+ - **Different job**: "Competitor X solves problem A. Could you solve a related but different problem B for the same audience?"
131
+ - **Channel advantage**: "Competitor X requires a desktop app. Could you win by being mobile-first, browser-based, or embedded in an existing workflow?"
132
+ - **Timing advantage**: "What has changed (new technology, regulation, cultural shift) that makes your approach viable now when it wasn't when competitors launched?"
133
+
134
+ ### Ideation Anti-Patterns
135
+
136
+ | Anti-pattern | What it sounds like | Why it's dangerous | How to challenge |
137
+ |-------------|--------------------|--------------------|-----------------|
138
+ | Solution-first | "I want to build an app that..." | Skips the problem entirely | "What problem does this solve? For whom?" |
139
+ | Everyone-needs-this | "Everyone could use this" | No target audience = no product | "Who needs this MOST? Who would pay?" |
140
+ | Feature soup | "It'll do X and Y and Z and..." | No core value proposition | "Remove one feature. Does it still work?" |
141
+ | Competitor blindness | "Nobody else does this" | Almost certainly false | "How do people solve this today?" |
142
+ | Technology hammer | "I learned [tech] and want to use it" | Technology seeking a problem | "Forget the tech. What problem exists?" |
143
+ | Scale fantasy | "Once we have millions of users..." | Ignores the path to the first user | "How do you get user #1? User #10?" |
144
+ | Uniqueness obsession | "We need a totally new idea" | Execution almost always beats novelty | "What existing idea could you execute 10x better?" |
145
+
146
+ ### Worked Example: From Vague to Sharp
147
+
148
+ **Vague starting point**: "An app for recipes"
149
+
150
+ **After Phase 1 (Seed):**
151
+ - Who: Home cooks who meal prep on weekends but waste food because they buy ingredients for recipes they never make.
152
+ - Problem: Planning meals for the week takes 45+ minutes, and existing apps have 50,000 recipes but no help deciding which ones to cook together.
153
+ - Core idea: A meal planning tool that suggests complementary recipes sharing ingredients, minimizing waste and shopping time.
154
+
155
+ **After Phase 2 (Research):**
156
+ - Competitors: Mealime (good UI but no ingredient overlap), Paprika (great for saving recipes but no planning), Eat This Much (calorie-focused, not taste-focused).
157
+ - Gap: No tool optimizes for ingredient reuse across a week of meals.
158
+
159
+ **After Phase 3 (Expand):**
160
+ - Accepted: Grocery list auto-generation from the meal plan (directly supports core value).
161
+ - Deferred: Social sharing of meal plans (not core, revisit later).
162
+ - Rejected: Calorie tracking (different problem, different audience).
163
+
164
+ **After Phase 4 (Challenge):**
165
+ - Confirmed: The ingredient-overlap algorithm is the differentiator.
166
+ - Revised: Scope down from "all cuisines" to "weeknight dinners, 30 min or less" for MVP.
167
+ - Locked out: No restaurant recommendations, no diet tracking, no social features for v1.
168
+
169
+ This progression from "an app for recipes" to a tightly scoped meal planning tool with a clear differentiator is what a good spark session produces.
170
+
171
+ ### Confidence Tagging Guide
172
+
173
+ Every claim in the spark brief should carry an implicit confidence level. This helps `create-vision` know what to validate vs. what to build on.
174
+
175
+ **Validated** (highest confidence):
176
+ - User stated it AND research supports it.
177
+ - Example: "3 competitors exist in this space" (user said, you verified via search).
178
+ - create-vision can build on this without re-exploring.
179
+
180
+ **Hypothesized** (medium confidence):
181
+ - User stated it but it hasn't been independently verified.
182
+ - Example: "Target users are freelance designers" (user's claim, no research to confirm market size).
183
+ - create-vision should probe deeper on these — targeted follow-up questions.
184
+
185
+ **Speculative** (lowest confidence):
186
+ - Surfaced during expansion or challenge, not yet confirmed by user.
187
+ - Example: "Meal planning apps retain 3x better than recipe apps" (research finding, user hasn't decided whether to pivot).
188
+ - create-vision should present these as open questions, not assumptions.
189
+
190
+ **How to apply in the brief:**
191
+ - Don't tag every sentence explicitly (clutters the document).
192
+ - Tag at the section level: "This section is largely validated — user confirmed the audience and research supports the competitive gap."
193
+ - Call out speculative items explicitly: "Note: the social sharing angle is speculative — surfaced during expansion, not yet confirmed."
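
The three confidence levels reduce to two questions: did the user state the claim, and does research support it? A minimal sketch of that mapping, assuming both answers are available as booleans (the function name is hypothetical):

```python
def confidence_tag(user_stated: bool, research_supports: bool) -> str:
    """Map a claim's provenance to the confidence levels described above."""
    if user_stated and research_supports:
        return "validated"      # user said it AND research backs it up
    if user_stated:
        return "hypothesized"   # user's claim, not independently verified
    # Surfaced by research, expansion, or challenge but not confirmed by the user.
    return "speculative"


print(confidence_tag(user_stated=True, research_supports=False))
```

Note that a research finding the user has not yet confirmed still counts as speculative in this sketch, matching the "3x retention" example above.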
194
+
195
+ ### Market Timing Analysis
196
+
197
+ When assessing "why now?", look for these signals:
198
+
199
+ **Technology shifts**: A new API, platform, or capability that makes something possible (or dramatically cheaper) that wasn't before. Example: LLMs making personalized recommendations affordable for indie tools.
200
+
201
+ **Regulatory changes**: New laws or standards that create demand or remove barriers. Example: GDPR creating demand for privacy-first alternatives.
202
+
203
+ **Behavioral changes**: Shifts in how people work, communicate, or consume. Example: Remote work increasing demand for async collaboration tools.
204
+
205
+ **Market failures**: Recent shutdowns, pivots, or public failures that leave an underserved audience. Example: A popular tool raising prices 10x, driving users to seek alternatives.
206
+
207
+ **Cultural shifts**: Changing attitudes that make new products viable. Example: Growing sustainability awareness creating demand for waste-reduction tools.
208
+
209
+ Each timing signal should be specific and verifiable — not "AI is trending" but "GPT-4's function calling API, launched in June 2023, makes it possible to build structured data extraction at 1/100th the cost of custom NLP pipelines."
@@ -112,6 +112,8 @@ knowledge-overrides:
112
112
  append: [game-design-document]
113
113
  critical-path-walkthrough:
114
114
  append: [game-design-document]
115
+ spark:
116
+ append: [game-ideation]
115
117
 
116
118
  # ---------------------------------------------------------------------------
117
119
  # reads-overrides
@@ -24,6 +24,7 @@ about ecosystem maturity, alternatives, and gotchas.
24
24
  ## Inputs
25
25
  - docs/plan.md (required) — PRD features, integrations, and technical requirements
26
26
  - User preferences (gathered via questions) — language, framework, deployment target, constraints
27
+ - docs/spark-brief.md (optional) — Technology Opportunities section from spark ideation session. If present and not stale (compare tracking comment date against docs/vision.md and docs/plan.md — if the brief predates both, ignore it), use the Technology Opportunities section as supplementary research context when evaluating technology options.
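
The staleness rule for the spark brief can be sketched as a date comparison. This is an illustration only: it assumes the three tracking comment dates have already been extracted, and it assumes that a missing date means the brief cannot be declared stale (the source does not say what to do in that case).

```python
from datetime import date
from typing import Optional


def brief_is_stale(
    brief: date, vision: Optional[date], plan: Optional[date]
) -> bool:
    """Stale only when the brief is older than BOTH downstream documents."""
    if vision is None or plan is None:
        # Assumption: without both dates we cannot say the brief predates both.
        return False
    return brief < vision and brief < plan


print(brief_is_stale(date(2025, 1, 5), date(2025, 2, 1), date(2025, 2, 10)))
```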
27
28
 
28
29
  ## Expected Outputs
29
30
  - docs/tech-stack.md — complete technology reference with architecture overview,
@@ -19,6 +19,7 @@ throughout the entire pipeline.
19
19
 
20
20
  ## Inputs
21
21
  - Project idea (provided by user verbally or in a brief)
22
+ - docs/spark-brief.md (optional) — upstream context from spark ideation session
22
23
  - Existing project files (if brownfield — any README, docs, or code)
23
24
  - Market context or competitive research (if available)
24
25
 
@@ -103,11 +104,53 @@ Before starting, check if `docs/vision.md` already exists:
103
104
  - **Related docs**: `docs/plan.md`
104
105
  - **Special rules**: Never change guiding principles without user approval. Preserve any strategic decisions that were explicitly made by the user.
105
106
 
107
+ ### Spark Brief Detection
108
+
109
+ **If `docs/spark-brief.md` exists**: Read it completely. Check its tracking
110
+ comment date against the `docs/vision.md` tracking comment date (if vision
111
+ exists). If the brief predates the current vision, ignore it and note:
112
+ "Spark brief found but predates current vision — ignoring." Check the
113
+ brief's heading (`# Spark Brief: [Idea Name]`) against the current
114
+ `$ARGUMENTS` — if the idea name appears unrelated, ask the user before
115
+ using it.
116
+
117
+ Otherwise, this is upstream context from a spark ideation session — the user
118
+ has already explored the problem space, researched competitors, expanded the
119
+ idea, and challenged assumptions.
120
+
121
+ **Accelerated mode**: Use the brief's answers as a baseline and ask targeted
122
+ follow-up questions to expand them to create-vision's required depth. Do not
123
+ skip phases — deepen and validate the brief's hypotheses rather than
124
+ re-exploring from scratch.
125
+
126
+ If the brief was red-teamed (per its Session Metadata), treat its competitive
127
+ landscape and risk sections as pre-validated hypotheses — focus discovery on
128
+ gaps or updates rather than re-exploring those areas.
129
+
130
+ create-vision uses its own configured depth regardless of the brief's depth.
131
+ The brief's depth metadata is informational — it tells you how thoroughly
132
+ the idea was explored, not how thorough this vision step should be.
133
+
134
+ Defer the brief's "Technology Opportunities" section to downstream phases
135
+ (tech-stack, architecture) — the vision document is about purpose and positioning,
136
+ not technical implementation.
137
+
138
+ **If `docs/spark-brief.md` does NOT exist**: Proceed normally.
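
The detection steps above amount to a three-way decision: ignore the brief, ask the user, or use it. The sketch below illustrates that flow under stated assumptions. The word-overlap check is a hypothetical stand-in for judging whether the idea name "appears unrelated", which the source leaves to the model, and blank arguments fall through to `ask_user` in this sketch.

```python
import re
from datetime import date
from typing import Optional

HEADING = re.compile(r"^# Spark Brief: (.+)$", re.MULTILINE)


def _words(text: str) -> set[str]:
    """Lowercased alphanumeric words, used for the crude overlap check."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def brief_decision(
    brief_text: str, brief_date: date, vision_date: Optional[date], arguments: str
) -> str:
    """Return 'ignore', 'ask_user', or 'use' for an existing spark brief."""
    if vision_date is not None and brief_date < vision_date:
        return "ignore"  # brief predates the current vision
    match = HEADING.search(brief_text)
    idea_words = _words(match.group(1)) if match else set()
    if not (idea_words & _words(arguments)):
        return "ask_user"  # idea name shares no words with $ARGUMENTS
    return "use"


brief = (
    "<!-- scaffold:spark-brief v1 2025-01-10 deep -->\n\n"
    "# Spark Brief: Weeknight Meal Planner\n"
)
print(brief_decision(brief, date(2025, 1, 10), None, "a meal planner for busy cooks"))
```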
139
+
106
140
  ## Here's my idea:
107
141
  $ARGUMENTS
108
142
 
109
143
  ## Phase 1: Strategic Discovery
110
144
 
145
+ ### Spark Brief Context
146
+
147
+ **If `docs/spark-brief.md` was read during Spark Brief Detection above**, use
148
+ it as your baseline for this phase. Do not skip phases — use the brief's
149
+ answers as a starting point and ask targeted follow-up questions to deepen
150
+ and validate the brief's hypotheses to create-vision's required depth.
151
+
152
+ **If no spark brief exists**, proceed normally with the discovery questions below.
153
+
111
154
  Use AskUserQuestionTool throughout this phase. Batch related questions together — don't ask one at a time.
112
155
 
113
156
  ### Understand the Problem Space
@@ -159,6 +159,7 @@ Print the following reference directly. Do not read any files or run any command
159
159
  | **Resume (multi)** | `/scaffold:multi-agent-resume <agent-name>` | Resuming a worktree agent after a break |
160
160
  | **Version Bump** | `/scaffold:version-bump` | Bump version + changelog (no tag/release) |
161
161
  | **Release** | `/scaffold:release` | Project-defined release ceremony with changelog + relevant release artifacts |
162
+ | **Spark** | `/scaffold:spark` | Explore and expand a raw project idea |
162
163
  | **Visual Dashboard** | `/scaffold:dashboard` | HTML pipeline overview in browser |
163
164
 
164
165
  ## Process Rules
@@ -0,0 +1,337 @@
1
+ ---
2
+ name: spark
3
+ description: Explore a raw project idea through Socratic questioning and research
4
+ summary: "Takes a vague idea and turns it into a well-formed idea brief through probing questions, competitive research, and innovation expansion. At higher depths, dispatches multi-model research and adversarial stress-testing. Feeds directly into create-vision."
5
+ phase: null
6
+ order: null
7
+ dependencies: []
8
+ outputs: [docs/spark-brief.md]
9
+ conditional: null
10
+ stateless: true
11
+ category: tool
12
+ knowledge-base: [ideation-craft, multi-model-research-dispatch]
13
+ argument-hint: "<idea or blank for interactive>"
14
+ ---
15
+
16
+ ## Purpose
17
+
18
+ Turn a vague project idea into a well-formed idea brief through Socratic
19
+ questioning and active research. Spark is two things in one: an interviewer
20
+ that asks hard questions AND a research companion that explores the problem
21
+ space and brings back insights the user hasn't considered.
22
+
23
+ The output (`docs/spark-brief.md`) feeds into `create-vision` as optional
24
+ upstream context, accelerating the vision step without replacing it.
25
+
26
+ **Prerequisite:** Requires `scaffold init` to have been run first.
27
+
28
+ ## Inputs
29
+ - User's idea (provided via `$ARGUMENTS` or interactively)
30
+ - Existing `docs/spark-brief.md` (if rerunning — triggers update/fresh choice)
31
+ - Web search results (depth 2+, if available on the platform)
32
+ - External model responses (depth 4+, if Codex/Gemini CLI available)
33
+
34
+ ## Expected Outputs
35
+ - `docs/spark-brief.md` — directional idea brief with 8 sections
36
+
37
+ ## Rerun Detection
38
+
39
+ Before starting, check if `docs/spark-brief.md` already exists:
40
+
41
+ **If the file exists:**
42
+ 1. Read the existing brief and present a 2-3 sentence summary to the user.
43
+ 2. Ask: "Update this brief or start fresh?"
44
+ 3. **Update mode**: Use the existing brief as a starting point. Preserve content
45
+ that is still relevant. Focus on deepening, expanding, or revising specific
46
+ sections. Increment the tracking comment version (e.g., `v1` → `v2`).
47
+ 4. **Fresh mode**: Overwrite the brief entirely. Start from Phase 1.
48
+
49
+ **If the file does NOT exist:** Proceed to Phase 1.
50
+
51
+ ## Instructions
52
+
53
+ ### Phase 1: Seed
54
+
55
+ Capture and clarify the raw idea.
56
+
57
+ **If `$ARGUMENTS` is provided:** Use it as the starting idea. Confirm your
58
+ understanding with the user before proceeding.
59
+
60
+ **If `$ARGUMENTS` is blank:** Ask: "What idea do you want to explore?"
61
+
62
+ Clarify the basics through progressive questioning. Batch 2-3 related
63
+ questions per turn:
64
+
65
+ **Turn 1** — What are you building? Who is it for? What problem does it solve?
66
+
67
+ **Turn 2** — How do people solve this today? What's painful about that? How
68
+ often do they experience this pain?
69
+
70
+ **Turn 3** — Describe the person who needs this most. What are they doing the
71
+ moment before they reach for your product? What does "success" look like for them?
72
+
73
+ **Turn 4+** — Follow the gaps. If the audience is unclear, pull on that. If the
74
+ problem is well-defined but the solution is vague, focus there. Don't follow
75
+ a script — follow what's missing.
76
+
77
+ **Exit condition:** You can articulate the idea back to the user in 2-3
78
+ sentences and the user confirms "yes, that's it."
79
+
80
+ After Phase 1 completes, assess adaptive heuristics (see Adaptive Behavior
81
+ section below) to calibrate Phases 2-4 intensity.
82
+
83
+ ### Phase 2: Research
84
+
85
+ Ground the idea in reality through competitive and market research.
86
+ Thoroughness scales with the project's configured depth.
87
+
88
+ **Depth 1:** Knowledge-based reasoning only. No web search. Draw on training
89
+ data to identify the most obvious competitors and alternatives.
90
+
91
+ **Depth 2:** 1-2 quick web searches for the most obvious competitors +
92
+ knowledge-based reasoning.
93
+
94
+ **Depth 3:** 2-3 targeted searches — direct competitors, market size, and
95
+ technology landscape.
96
+
97
+ **Depth 4:** Comprehensive research + dispatch to 1 external model for
98
+ independent competitive research.
99
+
100
+ ```bash
101
+ # Check Codex or Gemini availability (see multi-model-research-dispatch knowledge)
102
+ # Prefer Gemini for research (Google search built-in)
103
+ # If unavailable, primary model does enhanced research with explicit
104
+ # competitor-analysis framing
105
+ ```
106
+
107
+ **Depth 5:** Comprehensive research + multi-model dispatch with reconciliation.
108
+ Dispatch to both Codex AND Gemini for diverse perspectives. Reconcile:
109
+ If 2+ models agree, label the finding consensus. If they disagree, label it divergent and always present minority views.
110
+ Single model (fallback) = skip reconciliation labels.
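
The reconciliation rule can be illustrated with a small counting sketch. It assumes each model's findings have already been normalized into comparable claim strings, which is the hard part in practice and is not shown here; the function name and the `unlabeled` marker are hypothetical.

```python
from collections import Counter


def reconcile(findings: dict[str, set[str]]) -> dict[str, str]:
    """Label each claim by how many models reported it.

    findings maps a model name to the set of claims it made.
    """
    if len(findings) < 2:
        # Single-model fallback: skip reconciliation labels entirely.
        return {c: "unlabeled" for claims in findings.values() for c in claims}
    counts = Counter(c for claims in findings.values() for c in claims)
    # Claims reported by only one model are kept and marked divergent,
    # so minority views are never dropped.
    return {c: ("consensus" if n >= 2 else "divergent") for c, n in counts.items()}


labels = reconcile({
    "codex": {"market is crowded", "pricing is the gap"},
    "gemini": {"market is crowded", "mobile is underserved"},
})
print(labels["market is crowded"])
```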
111
+
112
+ **At all depths:** Bring findings INTO the conversation. Don't dump raw results —
113
+ synthesize: "I found 4 apps in this space — here's what they do well and
114
+ where they fall short."
115
+
116
+ **Exit condition (by depth):**
117
+ - Depth 1-2: At least 2 alternatives named (competitor, "do nothing," or
118
+ adjacent). User acknowledges the landscape.
119
+ - Depth 3: Direct competitors and at least 1 indirect alternative researched
120
+ with strengths/weaknesses. User acknowledges.
121
+ - Depth 4-5: Comprehensive landscape including direct, indirect, and emerging
122
+ threats. If multi-model dispatched, perspectives synthesized. User acknowledges.
123
+
124
+ ### Phase 3: Expand
125
+
126
+ Surface opportunities the user hasn't considered. Use the lightweight
127
+ expansion patterns from ideation-craft knowledge:
128
+
129
+ - Adjacent market: "Your users also need X — have you considered that?"
130
+ - Ecosystem play: "If you solve A, you're the natural place for B and C."
131
+ - Contrarian angle: "Everyone does X. What if you did the opposite?"
132
+ - Technology enabler: "A new capability makes Y possible — could that reshape
133
+ your approach?"
134
+ - AI-native rethinking: "If AI could handle Z, how would that change the product?"
135
+
136
+ **Depth scaling:**
137
+ - Depth 1-2: 1-2 expansion suggestions with brief rationale.
138
+ - Depth 3: 3-5 ideas with rationale.
139
+ - Depth 4-5: Full expansion pass leveraging Phase 2 research. Generate ideas
140
+ from existing data — no new searches in this phase.
141
+
142
+ Tag all expansion ideas as **preliminary** — the pipeline's `innovate-vision`
143
+ step does comprehensive strategic expansion later.
144
+
145
+ **Exit condition:** Present each expansion idea to the user. Each gets an
146
+ explicit disposition: **accept** (include in brief), **defer** (note as open
147
+ question), or **reject** (drop).
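
One way to picture the three dispositions is as routing into sections of the brief. In this sketch, accepted ideas go to "Expansion Ideas" with a preliminary tag and deferred ideas go to "Open Questions"; that section mapping is an assumption based on the wording above, and the function name is hypothetical.

```python
def route_expansion_ideas(dispositions: dict[str, str]) -> dict[str, list[str]]:
    """Sort expansion ideas into brief sections by the user's disposition."""
    routed: dict[str, list[str]] = {"Expansion Ideas": [], "Open Questions": []}
    for idea, disposition in dispositions.items():
        if disposition == "accept":
            routed["Expansion Ideas"].append(f"{idea} (preliminary)")
        elif disposition == "defer":
            routed["Open Questions"].append(idea)
        elif disposition != "reject":  # rejected ideas are simply dropped
            raise ValueError(f"unknown disposition: {disposition!r}")
    return routed


routed = route_expansion_ideas({
    "Grocery list auto-generation": "accept",
    "Social sharing of meal plans": "defer",
    "Calorie tracking": "reject",
})
print(routed)
```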
148
+
149
+ ### Phase 4: Challenge
150
+
151
+ Converge. Challenge every assumption surfaced in Phases 1-3, including
152
+ accepted expansion ideas.
153
+
154
+ **What to challenge:**
155
+ - **Feasibility**: Can this actually be built with the stated resources/timeline?
156
+ - **Scope**: Is this too broad? "If you could only ship ONE feature, what is it?"
157
+ - **Technical reality**: Are there hard technical constraints being glossed over?
158
+ - **Positioning**: "Three competitors already do this. What's your genuine
159
+ differentiator?"
160
+ - **Accepted expansions**: Phase 3 accepts are baseline intent. Phase 4 may
161
+ scope down or reject accepted ideas if they critically fail feasibility or
162
+ technical reality checks. Don't re-litigate the value of ideas the user
163
+ accepted — but DO challenge whether they're buildable and whether the overall
164
+ positioning holds against the competitive landscape.
165
+
166
+ **Exit condition:** Each challenged assumption is confirmed or revised by the
167
+ user. Core scope is explicitly locked — the user knows what's in and what's out.
168
+
169
+ ### Phase 5: Synthesize
170
+
171
+ Write `docs/spark-brief.md`. Create the `docs/` directory if it doesn't exist.
172
+
173
+ The brief is intentionally shallow — directional hypotheses, not validated
174
+ conclusions. Target 2-4 sentences or concise bullet points per section.
175
+ Sections may state "None identified" if inapplicable.
176
+
177
+ **At depth 1-3:** Present the brief to the user for final approval. Write the
178
+ file to disk after approval. This is the terminal phase — spark is complete.
179
+
180
+ **At depth 4+:** Generate the draft brief in conversation (not yet written to
181
+ disk). Present to the user for awareness: "Here's what I have before we
182
+ stress-test it." Then proceed to Phase 6.
183
+
184
+ Use the template in the Spark Brief Template section below.
185
+
186
+ ### Phase 6: Red-Team (depth 4+ only)
187
+
188
+ Send the draft spark brief to available external models as adversarial
189
+ reviewers.
190
+
191
+ **Depth 4:** Dispatch to 1 external model.
192
+ **Depth 5:** Dispatch to both Codex AND Gemini with reconciliation.
193
+
194
+ **Red-team prompt for external models:**
195
+
196
+ ```
197
+ You are an adversarial reviewer stress-testing a product idea brief.
198
+ Your job is to find weaknesses, challenge assumptions, and surface missed
199
+ opportunities.
200
+
201
+ SPARK BRIEF:
202
+ [Full content of the draft spark-brief.md]
203
+
204
+ CHALLENGE INSTRUCTIONS:
205
+ 1. For each section, identify the weakest assumption and explain why it might
206
+ be wrong.
207
+ 2. What competitors or market dynamics does the brief underestimate?
208
+ 3. What technical feasibility risks are glossed over?
209
+ 4. What user segments or use cases are missing?
210
+ 5. If you could only flag ONE critical risk, what would it be?
211
+
212
+ Be constructive but ruthless. Respond in structured markdown.
213
+ ```
214
+
215
+ **Execution:**
216
+
217
+ ```bash
218
+ # Codex
219
+ codex exec --skip-git-repo-check -s read-only --ephemeral "RED_TEAM_PROMPT" 2>&1
220
+
221
+ # Gemini
222
+ NO_BROWSER=true gemini -p "RED_TEAM_PROMPT" --output-format json --approval-mode yolo 2>/dev/null
223
+ ```
224
+
225
+ **If no external models are available:** Fall back to the primary model with a distinct
226
+ "red team" system prompt. Use the three-perspective approach from
227
+ multi-model-research-dispatch knowledge (VC, competitor PM, skeptical user).
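
The reviewer selection for Phase 6 can be sketched as follows. This is an illustration under assumptions: the CLI binaries are assumed to be named `codex` and `gemini` (as in the commands above), trying Codex first at depth 4 is an arbitrary choice, and `primary-red-team` is a hypothetical label for the fallback.

```python
import shutil


def detect_available() -> set[str]:
    """Return which external CLIs are on PATH (binary names are assumed)."""
    return {name for name in ("codex", "gemini") if shutil.which(name)}


def pick_red_team_models(depth: int, available: set[str]) -> list[str]:
    """Choose adversarial reviewers for the configured depth."""
    if depth < 4:
        return []  # Phase 6 is skipped below depth 4
    externals = [m for m in ("codex", "gemini") if m in available]
    if not externals:
        return ["primary-red-team"]  # fall back to the primary model
    # Depth 4 uses one external model; depth 5 uses every available one.
    return externals if depth >= 5 else externals[:1]


print(pick_red_team_models(5, detect_available()))
```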
228
+
229
+ **Processing challenges:**
230
+ - Present each challenge to the user one at a time.
231
+ - For each: **accept** (update the brief), **dismiss** (explain why), or
232
+ **defer** (note as open question).
233
+ - Update the brief based on accepted challenges.
234
+
235
+ **Exit condition:** User reviews all red-team findings and gives final approval.
236
+ Write the updated brief to disk.
237
+
238
+ ### Adaptive Behavior
239
+
240
+ Assess these heuristics continuously, beginning in Phase 1. Use the idea's
241
+ characteristics to calibrate behavior across all phases:
242
+
243
+ - **Well-formed idea** → Phase 1 is brief. Move to research quickly.
244
+ - **Crowded space** → Phases 3 and 4 intensify. More expansion ideas to
245
+ differentiate, more competitive positioning challenges.
246
+ - **Novel idea (no competitors)** → Phase 2 shifts to adjacent-space and
247
+ analogous-system research. Phase 4 focuses on market-existence risk.
248
+
249
+ Phase transitions use natural conversational pivots, not mechanical
250
+ announcements. ("Now that I understand the core idea, let me research what
251
+ else is out there...")
252
+
253
+ ## Methodology Scaling
254
+
255
+ | Depth | Phase 1 (Seed) | Phase 2 (Research) | Phase 3 (Expand) | Phase 4 (Challenge) | Phase 5 (Synthesize) | Phase 6 (Red-Team) |
256
+ |-------|----------------|-------------------|-------------------|--------------------|--------------------|-------------------|
257
+ | 1 | 2-3 questions | Knowledge only, no search | 1 suggestion | Light — 1-2 key challenges | Brief, terminal | Skip |
258
+ | 2 | 2-3 questions | 1-2 quick searches + knowledge | 1-2 suggestions | Light challenge | Brief, terminal | Skip |
259
+ | 3 | 5-8 questions | 2-3 targeted searches | 3-5 ideas | Full challenge | Brief, terminal | Skip |
260
+ | 4 | 5-8 questions | Comprehensive + 1 external model | Full expansion | Full challenge | Draft, continue | 1 external model |
261
+ | 5 | 8-12 questions | Comprehensive + multi-model w/ reconciliation | Full expansion | Full challenge | Draft, continue | Multi-model + reconciliation |
262
+
263
+ **Presets:** mvp = Depth 1 | deep = Depth 5 | custom = user-specified
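
The preset mapping and the depth thresholds in the table can be summarized in a short sketch. The function names are hypothetical, and the 1 to 5 range check for custom depths is an assumption drawn from the table rows.

```python
from typing import Optional

PRESET_DEPTH = {"mvp": 1, "deep": 5}


def resolve_depth(methodology: str, custom_depth: Optional[int] = None) -> int:
    """Translate a methodology preset into a depth from 1 to 5."""
    if methodology == "custom":
        if custom_depth is None or not 1 <= custom_depth <= 5:
            raise ValueError("custom methodology needs a depth from 1 to 5")
        return custom_depth
    return PRESET_DEPTH[methodology]


def uses_web_search(depth: int) -> bool:
    """Depth 1 is knowledge-only; searches start at depth 2."""
    return depth >= 2


def runs_red_team(depth: int) -> bool:
    """Phase 5 is terminal at depths 1-3; Phase 6 runs at depth 4 and above."""
    return depth >= 4


print(resolve_depth("deep"), runs_red_team(resolve_depth("deep")))
```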
264
+
265
+ ## Spark Brief Template
266
+
267
+ When writing `docs/spark-brief.md`, use this exact structure:
268
+
269
+ ```markdown
270
+ <!-- scaffold:spark-brief v[N] YYYY-MM-DD [methodology] -->
271
+
272
+ # Spark Brief: [Idea Name]
273
+
274
+ > Generated by `scaffold run spark` — directional hypotheses, not validated
275
+ > conclusions. This document feeds into `create-vision` as a starting point,
276
+ > not a replacement.
277
+
278
+ ## Idea & Problem Space
279
+ [What the user wants to build, the core problem it solves, who it's for
280
+ and why they need it — from Phase 1]
281
+
282
+ ## Landscape
283
+ [Key competitors/alternatives with strengths/weaknesses, positioning,
284
+ market context — from Phase 2]
285
+
286
+ ## Expansion Ideas
287
+ [Accepted expansion ideas tagged as preliminary, deferred ideas noted —
288
+ from Phase 3]
289
+
290
+ ## Constraints & Scope
291
+ [Confirmed assumptions, scope boundaries, what's in and what's out,
292
+ locked decisions — from Phase 4]
293
+
294
+ ## Technology Opportunities
295
+ [Relevant tech enablers surfaced during research/expansion]
296
+
297
+ ## Open Questions
298
+ [Unresolved items flagged during conversation that need answers before building]
299
+
300
+ ## Risks
301
+ [Market, technical, and feasibility risks identified during challenge —
302
+ from Phase 4 and Phase 6 if red-teamed]
303
+
304
+ ## Session Metadata
305
+ - **Depth**: [1-5]
306
+ - **Red-teamed**: [yes/no]
307
+ - **Models consulted**: [list if multi-model, or "primary only"]
308
+ - **Date**: [YYYY-MM-DD]
309
+ ```
310
+
311
+ **Tracking comment format:** `<!-- scaffold:spark-brief v[N] YYYY-MM-DD [methodology] -->` where:
312
+ - `v[N]` increments on each update (v1, v2, v3...)
313
+ - `YYYY-MM-DD` is the session date
314
+ - `[methodology]` is the active methodology preset (e.g., `deep`, `mvp`, `custom`)
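
The tracking comment format lends itself to a small parser. The sketch below is illustrative rather than the package's own implementation: the regular expression is inferred from the format string above, and the helper names are hypothetical.

```python
import re
from datetime import date

TRACKING = re.compile(
    r"<!-- scaffold:spark-brief v(\d+) (\d{4}-\d{2}-\d{2}) (\S+) -->"
)


def parse_tracking(comment: str) -> tuple[int, date, str]:
    """Extract (version, session date, methodology) from a tracking comment."""
    m = TRACKING.search(comment)
    if m is None:
        raise ValueError("no spark-brief tracking comment found")
    return int(m.group(1)), date.fromisoformat(m.group(2)), m.group(3)


def bump_tracking(comment: str, today: date, methodology: str) -> str:
    """Build the next tracking comment for update mode (v1 -> v2, and so on)."""
    version, _, _ = parse_tracking(comment)
    return (
        f"<!-- scaffold:spark-brief v{version + 1} "
        f"{today.isoformat()} {methodology} -->"
    )


old = "<!-- scaffold:spark-brief v1 2025-03-04 deep -->"
print(bump_tracking(old, date(2025, 4, 1), "deep"))
```

The parsed date is what a downstream step would compare against other documents' tracking comments when checking whether the brief is stale.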
315
+
316
+ **Idea identity:** The idea name is captured in the `# Spark Brief: [Idea Name]` heading, not in the tracking comment. For identity matching, create-vision compares the heading's idea name against `$ARGUMENTS`.
317
+
318
+ ## How to Work With Me
319
+ - I'm your co-founder for the next few minutes. I'll challenge you AND do homework on your behalf.
320
+ - I'll ask hard questions. That's the point — weak assumptions caught now save months later.
321
+ - I'll research while we talk. When I find something relevant, I'll bring it into the conversation.
322
+ - Don't hold back on vague ideas. "Something with recipes" is a perfectly fine starting point.
323
+ - Tell me if I'm going down the wrong path. This is a conversation, not a lecture.
324
+
325
+ ## After This Step
326
+
327
+ When spark is complete, tell the user:
328
+
329
+ ---
330
+ **Spark complete** — `docs/spark-brief.md` created.
331
+
332
+ **Next:** Run `scaffold run create-vision` — the vision step will detect your
333
+ spark brief and use it as a starting point, accelerating the discovery process.
334
+
335
+ **Pipeline reference:** `scaffold run prompt-pipeline`
336
+
337
+ ---
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@zigrivers/scaffold",
3
- "version": "3.10.1",
3
+ "version": "3.11.0",
4
4
  "description": "AI-powered software project scaffolding pipeline",
5
5
  "type": "module",
6
6
  "workspaces": [