universal-dev-standards 5.15.1 → 5.16.0
- package/bundled/ai/standards/acceptance-criteria-traceability.ai.yaml +31 -0
- package/bundled/ai/standards/forward-derivation-standards.ai.yaml +23 -0
- package/bundled/ai/standards/knowledge-graph-memory.ai.yaml +1 -1
- package/bundled/core/acceptance-criteria-traceability.md +46 -0
- package/bundled/core/forward-derivation-standards.md +19 -0
- package/bundled/core/knowledge-graph-memory.md +2 -2
- package/bundled/locales/zh-CN/CHANGELOG.md +13 -3
- package/bundled/locales/zh-CN/README.md +1 -1
- package/bundled/locales/zh-CN/core/acceptance-criteria-traceability.md +46 -0
- package/bundled/locales/zh-CN/core/forward-derivation-standards.md +19 -0
- package/bundled/locales/zh-CN/skills/ac-coverage/SKILL.md +194 -0
- package/bundled/locales/zh-CN/skills/adr-assistant/SKILL.md +135 -40
- package/bundled/locales/zh-CN/skills/brainstorm-assistant/SKILL.md +217 -63
- package/bundled/locales/zh-CN/skills/brainstorm-assistant/guide.md +599 -0
- package/bundled/locales/zh-CN/skills/commands/brainstorm.md +92 -25
- package/bundled/locales/zh-CN/skills/commit-standards/SKILL.md +78 -16
- package/bundled/locales/zh-CN/skills/contract-test-assistant/SKILL.md +85 -26
- package/bundled/locales/zh-CN/skills/deploy-assistant/SKILL.md +189 -0
- package/bundled/locales/zh-CN/skills/dev-methodology/SKILL.md +110 -0
- package/bundled/locales/zh-CN/skills/dev-methodology/guide.md +255 -0
- package/bundled/locales/zh-CN/skills/dev-workflow-guide/SKILL.md +70 -11
- package/bundled/locales/zh-CN/skills/journey-test-assistant/SKILL.md +209 -0
- package/bundled/locales/zh-CN/skills/knowledge-graph/SKILL.md +58 -0
- package/bundled/locales/zh-CN/skills/knowledge-graph/guide.md +74 -0
- package/bundled/locales/zh-CN/skills/migration-assistant/SKILL.md +125 -8
- package/bundled/locales/zh-CN/skills/observability-assistant/guide.md +188 -0
- package/bundled/locales/zh-CN/skills/orchestrate/SKILL.md +173 -0
- package/bundled/locales/zh-CN/skills/plan/SKILL.md +240 -0
- package/bundled/locales/zh-CN/skills/push/SKILL.md +242 -0
- package/bundled/locales/zh-CN/skills/retrospective-assistant/SKILL.md +104 -36
- package/bundled/locales/zh-CN/skills/reverse-engineer/SKILL.md +88 -32
- package/bundled/locales/zh-CN/skills/runbook-assistant/guide.md +216 -0
- package/bundled/locales/zh-CN/skills/skill-builder/SKILL.md +149 -0
- package/bundled/locales/zh-CN/skills/slo-assistant/guide.md +188 -0
- package/bundled/locales/zh-CN/skills/spec-derivation/SKILL.md +86 -0
- package/bundled/locales/zh-CN/skills/spec-derivation/guide.md +476 -0
- package/bundled/locales/zh-CN/skills/spec-driven-dev/SKILL.md +155 -81
- package/bundled/locales/zh-CN/skills/sweep/SKILL.md +151 -0
- package/bundled/locales/zh-CN/skills/testing-guide/SKILL.md +207 -110
- package/bundled/locales/zh-TW/CHANGELOG.md +13 -3
- package/bundled/locales/zh-TW/README.md +1 -1
- package/bundled/locales/zh-TW/core/acceptance-criteria-traceability.md +46 -0
- package/bundled/locales/zh-TW/core/browser-compatibility-standards.md +222 -5
- package/bundled/locales/zh-TW/core/contract-testing-standards.md +184 -5
- package/bundled/locales/zh-TW/core/cross-flow-regression.md +192 -5
- package/bundled/locales/zh-TW/core/forward-derivation-standards.md +19 -0
- package/bundled/locales/zh-TW/core/knowledge-graph-memory.md +2 -2
- package/bundled/locales/zh-TW/core/release-readiness-gate.md +186 -5
- package/bundled/locales/zh-TW/skills/adr-assistant/SKILL.md +21 -42
- package/bundled/locales/zh-TW/skills/brainstorm-assistant/SKILL.md +212 -59
- package/bundled/locales/zh-TW/skills/brainstorm-assistant/guide.md +266 -579
- package/bundled/locales/zh-TW/skills/commands/brainstorm.md +91 -26
- package/bundled/locales/zh-TW/skills/commit-standards/SKILL.md +77 -15
- package/bundled/locales/zh-TW/skills/contract-test-assistant/SKILL.md +75 -16
- package/bundled/locales/zh-TW/skills/dev-methodology/guide.md +255 -0
- package/bundled/locales/zh-TW/skills/dev-workflow-guide/SKILL.md +125 -64
- package/bundled/locales/zh-TW/skills/knowledge-graph/SKILL.md +5 -5
- package/bundled/locales/zh-TW/skills/knowledge-graph/guide.md +74 -0
- package/bundled/locales/zh-TW/skills/migration-assistant/SKILL.md +128 -11
- package/bundled/locales/zh-TW/skills/observability-assistant/guide.md +188 -0
- package/bundled/locales/zh-TW/skills/orchestrate/SKILL.md +3 -2
- package/bundled/locales/zh-TW/skills/plan/SKILL.md +3 -2
- package/bundled/locales/zh-TW/skills/push/SKILL.md +3 -2
- package/bundled/locales/zh-TW/skills/retrospective-assistant/SKILL.md +94 -28
- package/bundled/locales/zh-TW/skills/reverse-engineer/SKILL.md +84 -28
- package/bundled/locales/zh-TW/skills/runbook-assistant/guide.md +216 -0
- package/bundled/locales/zh-TW/skills/slo-assistant/guide.md +188 -0
- package/bundled/locales/zh-TW/skills/spec-derivation/guide.md +476 -0
- package/bundled/locales/zh-TW/skills/spec-driven-dev/SKILL.md +148 -77
- package/bundled/locales/zh-TW/skills/testing-guide/SKILL.md +141 -44
- package/bundled/skills/brainstorm-assistant/SKILL.md +142 -106
- package/bundled/skills/brainstorm-assistant/guide.md +256 -661
- package/bundled/skills/commands/brainstorm.md +51 -30
- package/bundled/skills/knowledge-graph/SKILL.md +5 -5
- package/bundled/skills/knowledge-graph/guide.md +4 -4
- package/package.json +2 -2
- package/src/commands/check.js +11 -2
- package/src/lint/i18n.js +109 -23
- package/standards-registry.json +4 -4
- package/bundled/locales/zh-TW/docs/SKILL-FALLBACK-GUIDE.md +0 -407
|
@@ -3,15 +3,15 @@ scope: universal
|
|
|
3
3
|
description: |
|
|
4
4
|
Guide structured AI-assisted brainstorming before specification writing.
|
|
5
5
|
Use when: vague ideas, feature exploration, problem reframing, creative ideation.
|
|
6
|
-
Keywords: brainstorm, ideation,
|
|
6
|
+
Keywords: brainstorm, ideation, persona ensemble, multi-critic, HMW, SCAMPER, 腦力激盪, 發想, 創意.
|
|
7
7
|
---
|
|
8
8
|
|
|
9
9
|
# Brainstorm Assistant Guide
|
|
10
10
|
|
|
11
11
|
> **Language**: English | [繁體中文](../../locales/zh-TW/skills/brainstorm-assistant/guide.md)
|
|
12
12
|
|
|
13
|
-
**Version**:
|
|
14
|
-
**Last Updated**: 2026-
|
|
13
|
+
**Version**: 3.0.0
|
|
14
|
+
**Last Updated**: 2026-06-01
|
|
15
15
|
**Applicability**: All software projects
|
|
16
16
|
**Scope**: universal
|
|
17
17
|
**Type**: Utility Skill (no core standard)
|
|
@@ -34,52 +34,42 @@ This skill fills the ideation gap in the UDS workflow:
|
|
|
34
34
|
(this) Existing Existing
|
|
35
35
|
```
|
|
36
36
|
|
|
37
|
-
|
|
38
|
-
|
|
39
|
-
## Research Foundations | 認知科學依據
|
|
40
|
-
|
|
41
|
-
The v2.0 workflow is grounded in three research findings that challenge
|
|
42
|
-
assumptions in Osborn's classic brainstorming rules:
|
|
43
|
-
|
|
44
|
-
v2.0 流程基於三項研究發現,這些發現挑戰了 Osborn 經典腦力激盪規則中的假設:
|
|
45
|
-
|
|
46
|
-
### 1. Independent thinking before merging (Nominal Group Technique)
|
|
37
|
+
### What v3 changes
|
|
47
38
|
|
|
48
|
-
|
|
49
|
-
outperform interacting groups in both quantity and quality. The mechanism is
|
|
50
|
-
**production blocking**: while listening to others (or reading AI output), your
|
|
51
|
-
own thought stream is interrupted.
|
|
39
|
+
v1 was a generic FRAME→DIVERGE→CONVERGE flow. v2 added cognitive-science gates (pre-flight anti-anchoring, a 10-idea gate, a single-AI rebuttal). **v3 re-centres the two work phases on the strongest findings in the 2024–2026 literature:**
|
|
52
40
|
|
|
53
|
-
**
|
|
54
|
-
|
|
41
|
+
- **DIVERGE** is now a **persona ensemble** (each role reasons via chain-of-thought, in isolation) crossed with **diversity lenses** — not a single AI voice racing to a count.
|
|
42
|
+
- **CONVERGE** is now a **multi-critic panel** plus a **hard-role rebuttal** (Devil's Advocate + Steelman) — not one AI scorer plus one soft critique.
|
|
55
43
|
|
|
56
|
-
|
|
57
|
-
防止 AI 先說話導致的錨定效應。
|
|
44
|
+
The pre-flight phase is **kept and strengthened** (fixation research says AI anchoring is real and possibly worse), while the 10-idea gate and the single-AI rebuttal are **demoted / hardened** because their original human-group evidence does not transfer cleanly to a single LLM.
|
|
58
45
|
|
|
59
|
-
|
|
46
|
+
---
|
|
60
47
|
|
|
61
|
-
|
|
62
|
-
are almost always the most familiar and obvious. Truly creative ideas emerge
|
|
63
|
-
after the "obvious answer zone" is exhausted — typically after idea 7 or 8.
|
|
48
|
+
## Research Foundations | 認知科學依據
|
|
64
49
|
|
|
65
|
-
|
|
66
|
-
Phase 2 force users past this threshold before evaluation begins.
|
|
50
|
+
v3 is grounded in six findings, each verified against its primary source. Author attributions below were cross-checked.
|
|
67
51
|
|
|
68
|
-
|
|
69
|
-
前突破「顯而易見答案區」。
|
|
52
|
+
v3 基於六項發現,每項都對照原始出處核對過(作者已校正)。
|
|
70
53
|
|
|
71
|
-
|
|
54
|
+
| # | Finding | Source |
|
|
55
|
+
|---|---------|--------|
|
|
56
|
+
| 1 | **Chain-of-thought + personas yields the highest idea diversity** of any prompting strategy, approaching human groups. | Meincke, Mollick & Terwiesch, *Prompting Diverse Ideas* (arXiv:2402.01727, 2024) |
|
|
57
|
+
| 2 | **Single-LLM ideation reduces idea diversity *across users*** even when each individual feels more creative. | Anderson, Shah & Kreminski, *Homogenization Effects of LLMs on Human Creative Ideation* (arXiv:2402.01536, 2024) |
|
|
58
|
+
| 3 | **Generative-AI output deepens design fixation and reduces divergent thinking** — fewer ideas, less variety. | Wadinambiarachchi, Kelly, Pareek, Zhou & Velloso, CHI 2024 (arXiv:2403.11164) |
|
|
59
|
+
| 4 | **A multi-agent "colleagues" system beats a single agent** on perceived outcome quality and novelty. | Quan, Albassam, Wu, Ding & Chin, *Towards AI as Colleagues* (arXiv:2510.23904, 2025) |
|
|
60
|
+
| 5 | **Associative / cross-domain prompting significantly increases originality.** | Mehrotra, Parab & Gulwani, *Enhancing Creativity in LLMs through Associative Thinking* (arXiv:2405.06715, 2024) |
|
|
61
|
+
| 6 | **LLMs are strong at idea generation and refinement but weak at scoping and multi-idea evaluation** (Hourglass framework). | Li, Padilla, Le Bras, Dong & Chantler, *A Review of LLM-Assisted Ideation* (arXiv:2503.00946, 2025) |
|
|
72
62
|
|
|
73
|
-
|
|
74
|
-
all judgment. Groups instructed to debate and criticise generated more ideas,
|
|
75
|
-
of higher quality, than groups following traditional "no criticism" brainstorming
|
|
76
|
-
rules. The mechanism: criticism forces explicit defence of assumptions, surfacing
|
|
77
|
-
hidden weaknesses before commitment.
|
|
63
|
+
> The widely-cited Doshi & Hauser, *Science Advances* (2024) — "generative AI enhances individual creativity but reduces the collective diversity of novel content" — corroborates finding #2. It is referenced as supporting evidence only; the homogenization guardrail anchors on the verified #2.
|
|
78
64
|
|
|
79
|
-
|
|
80
|
-
structured debate on the top-ranked ideas before final selection.
|
|
65
|
+
### How each finding maps into the flow
|
|
81
66
|
|
|
82
|
-
|
|
67
|
+
1. **#1 → DIVERGE default mechanism**: personas + chain-of-thought.
|
|
68
|
+
2. **#2 → Diversity-Collapse Guardrail**: don't seed with analogies; vary lenses not wording.
|
|
69
|
+
3. **#3 → PRE-FLIGHT kept/strengthened**: write your own ideas first; no "like X but for Y" seed.
|
|
70
|
+
4. **#4 → Enhanced Tier**: parallel persona/critic agents where the host supports them.
|
|
71
|
+
5. **#5 → Diversity lenses**: analogical, assumption-reversal, morphological.
|
|
72
|
+
6. **#6 → Multi-critic panel + human arbiter**: aggregate three critic lenses; the human decides.
|
|
83
73
|
|
|
84
74
|
---
|
|
85
75
|
|
|
@@ -88,22 +78,22 @@ structured debate on the top-ranked ideas before final selection.
|
|
|
88
78
|
### Workflow Overview
|
|
89
79
|
|
|
90
80
|
```
|
|
91
|
-
┌─────────────┐ ┌────────────┐
|
|
92
|
-
│ PRE-FLIGHT │─▶│ FRAME │─▶│
|
|
93
|
-
│ (Phase 0) │ │ Define the │ │
|
|
94
|
-
│ User writes │ │ problem │ │
|
|
95
|
-
│ 3 ideas │ │ │ │
|
|
96
|
-
└─────────────┘ └────────────┘
|
|
81
|
+
┌─────────────┐ ┌────────────┐ ┌──────────────────────┐ ┌────────────────────────┐ ┌────────────┐
|
|
82
|
+
│ PRE-FLIGHT │─▶│ FRAME │─▶│ DIVERGE │─▶│ CONVERGE │─▶│ OUTPUT │
|
|
83
|
+
│ (Phase 0) │ │ Define the │ │ Persona ensemble + │ │ Multi-critic panel + │ │ Brainstorm │
|
|
84
|
+
│ User writes │ │ problem │ │ diversity lenses │ │ hard-role rebuttal │ │ Report │
|
|
85
|
+
│ 3 ideas │ │ │ │ (CoT, branch-isolated)│ │ (DA + Steelman) │ │ │
|
|
86
|
+
└─────────────┘ └────────────┘ └──────────────────────┘ └────────────────────────┘ └────────────┘
|
|
97
87
|
```
|
|
98
88
|
|
|
99
89
|
### Phase Summary
|
|
100
90
|
|
|
101
|
-
| Phase | Goal | Key Mechanism | Time |
|
|
102
|
-
|
|
103
|
-
| **PRE-FLIGHT** | Prevent AI anchoring | User writes 3 ideas first | 3–5 min |
|
|
104
|
-
| **FRAME** | Define problem clearly | 5 Whys, HMW | 10–15 min |
|
|
105
|
-
| **DIVERGE** |
|
|
106
|
-
| **CONVERGE** | Select
|
|
91
|
+
| Phase | Goal | Key Mechanism (v3) | Time |
|
|
92
|
+
|-------|------|--------------------|------|
|
|
93
|
+
| **PRE-FLIGHT** | Prevent AI anchoring | User writes 3 ideas first; no analogy seed | 3–5 min |
|
|
94
|
+
| **FRAME** | Define problem clearly | 5 Whys, HMW, stakeholders | 10–15 min |
|
|
95
|
+
| **DIVERGE** | Force viewpoint diversity | Persona ensemble + diversity lenses | 15–25 min |
|
|
96
|
+
| **CONVERGE** | Select bias-checked ideas | Multi-critic panel + hard-role rebuttal | 15–20 min |
|
|
107
97
|
| **OUTPUT** | Actionable report | Brainstorm Report template | 5–10 min |
|
|
108
98
|
|
|
109
99
|
---
|
|
@@ -116,12 +106,9 @@ structured debate on the top-ranked ideas before final selection.
|
|
|
116
106
|
|
|
117
107
|
### Why this matters
|
|
118
108
|
|
|
119
|
-
|
|
120
|
-
sees any AI-generated framing, subsequent ideas cluster within that semantic
|
|
121
|
-
space. Pre-flight creates an "intellectual immune system" against this bias.
|
|
109
|
+
Research shows that once a person sees any AI-generated framing, subsequent ideas cluster within that semantic space. In AI-assisted contexts this is **stronger**, not weaker: design-fixation studies (Wadinambiarachchi et al., CHI 2024) find that fluent, high-fidelity AI output *deepens* fixation and reduces the variety and originality of subsequent ideas. Pre-flight creates an "intellectual immune system" against this bias.
|
|
122
110
|
|
|
123
|
-
|
|
124
|
-
空間內聚集。Pre-flight 為這種偏見創造了「智識免疫系統」。
|
|
111
|
+
研究顯示,一旦看到任何 AI 生成框架,後續想法就會在該語義空間內聚集。在 AI 情境下這**更強**:設計固著研究(Wadinambiarachchi 等,CHI 2024)發現流暢高擬真的 AI 輸出會*加深*固著、降低後續想法的多樣性與原創性。Pre-flight 為此偏見建立「智識免疫系統」。
|
|
125
112
|
|
|
126
113
|
### Prompt the user to provide
|
|
127
114
|
|
|
@@ -138,26 +125,20 @@ Before we start brainstorming, please take 2–3 minutes to write:
|
|
|
138
125
|
Submit when ready. The AI will read these before generating anything.
|
|
139
126
|
```
|
|
140
127
|
|
|
141
|
-
###
|
|
128
|
+
### Anti-seed guardrail (new in v3)
|
|
142
129
|
|
|
143
|
-
|
|
144
|
-
2. Proceed to FRAME
|
|
145
|
-
3. In DIVERGE Batch 1, explicitly explore directions the user did not mention
|
|
146
|
-
4. If the user declared an unwanted solution type, exclude that type from all
|
|
147
|
-
generated ideas throughout the session
|
|
130
|
+
Do **not** accept or generate a "like X but for Y" framing as the seed (e.g. "Slack but for doctors"). Analogical product seeds lock the LLM into one solution space and measurably reduce idea variety (this is the same homogenization mechanism as finding #2). Capture the underlying *problem*, not a product analogy. If the user offers such a seed, restate it as a problem ("teams of clinicians lose context between shifts") before proceeding.
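A minimal sketch of how a host tool might flag a "like X but for Y" seed before FRAME, assuming a simple regular-expression heuristic. The function name `looks_like_analogy_seed` and the pattern are illustrative assumptions, not part of the skill. The heuristic will miss other phrasings and can produce false positives, so it is best treated as a prompt to restate the seed as a problem rather than a hard block.

```python
import re

# Hypothetical heuristic: matches "X but for Y" and "like X but for Y".
_ANALOGY_SEED = re.compile(
    r"\b(?:like\s+)?\w[\w\s.-]*?\s+but\s+for\s+\w+",
    re.IGNORECASE,
)


def looks_like_analogy_seed(seed: str) -> bool:
    """Return True when the seed reads as a product analogy rather than a problem."""
    return _ANALOGY_SEED.search(seed) is not None


if __name__ == "__main__":
    for seed in (
        "Slack but for doctors",
        "teams of clinicians lose context between shifts",
    ):
        flag = "restate as a problem" if looks_like_analogy_seed(seed) else "ok"
        print(f"{seed!r}: {flag}")
```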
|
|
148
131
|
|
|
149
|
-
###
|
|
132
|
+
### AI behaviour after receiving Pre-flight input
|
|
150
133
|
|
|
151
|
-
|
|
134
|
+
1. Acknowledge the user's ideas without evaluating them.
|
|
135
|
+
2. Proceed to FRAME.
|
|
136
|
+
3. In DIVERGE, explicitly explore directions the user did not mention.
|
|
137
|
+
4. If the user declared an unwanted solution type, exclude it from all generated ideas.
|
|
152
138
|
|
|
153
|
-
|
|
154
|
-
⚠ Skipping Pre-flight may cause AI anchoring
|
|
155
|
-
```
|
|
139
|
+
### Skipping Pre-flight
|
|
156
140
|
|
|
157
|
-
|
|
158
|
-
- The user has already written extensive notes elsewhere
|
|
159
|
-
- This is a repeat session on a well-understood problem
|
|
160
|
-
- Time is severely constrained (use `--quick` instead when possible)
|
|
141
|
+
Use `--skip-preflight` to bypass. A one-line warning is displayed: `⚠ Skipping Pre-flight may cause AI anchoring`. Appropriate when the user already has extensive notes, this is a repeat session, or time is severely constrained (prefer `--quick`).
|
|
161
142
|
|
|
162
143
|
---
|
|
163
144
|
|
|
@@ -171,26 +152,13 @@ The session continues immediately to FRAME. Pre-flight skip is appropriate when:
|
|
|
171
152
|
|
|
172
153
|
Ask "Why?" repeatedly to dig beneath surface-level problems.
|
|
173
154
|
|
|
174
|
-
**Template:**
|
|
175
|
-
|
|
176
155
|
```
|
|
177
156
|
Problem: [Initial problem statement]
|
|
178
|
-
|
|
179
|
-
Why
|
|
180
|
-
→ Because [reason
|
|
181
|
-
|
|
182
|
-
Why
|
|
183
|
-
→ Because [reason 2]
|
|
184
|
-
|
|
185
|
-
Why 3: Why does [reason 2] happen?
|
|
186
|
-
→ Because [reason 3]
|
|
187
|
-
|
|
188
|
-
Why 4: Why does [reason 3] happen?
|
|
189
|
-
→ Because [reason 4]
|
|
190
|
-
|
|
191
|
-
Why 5: Why does [reason 4] happen?
|
|
192
|
-
→ Because [root cause]
|
|
193
|
-
|
|
157
|
+
Why 1: Why does this problem exist? → Because [reason 1]
|
|
158
|
+
Why 2: Why does [reason 1] happen? → Because [reason 2]
|
|
159
|
+
Why 3: Why does [reason 2] happen? → Because [reason 3]
|
|
160
|
+
Why 4: Why does [reason 3] happen? → Because [reason 4]
|
|
161
|
+
Why 5: Why does [reason 4] happen? → Because [root cause]
|
|
194
162
|
Root Cause: [root cause]
|
|
195
163
|
```
|
|
196
164
|
|
|
@@ -198,41 +166,20 @@ Root Cause: [root cause]
|
|
|
198
166
|
|
|
199
167
|
```
|
|
200
168
|
Problem: Users abandon the checkout flow
|
|
201
|
-
|
|
202
|
-
Why
|
|
203
|
-
→ Because
|
|
204
|
-
|
|
205
|
-
Why
|
|
206
|
-
→ Because there are 5 separate pages
|
|
207
|
-
|
|
208
|
-
Why 3: Why are there 5 pages?
|
|
209
|
-
→ Because each validation step has its own page
|
|
210
|
-
|
|
211
|
-
Why 4: Why does each validation need a page?
|
|
212
|
-
→ Because the original design assumed slow connections
|
|
213
|
-
|
|
214
|
-
Why 5: Why does that assumption still hold?
|
|
215
|
-
→ It doesn't — most users are on broadband now
|
|
216
|
-
|
|
169
|
+
Why 1: → Because the process takes too long
|
|
170
|
+
Why 2: → Because there are 5 separate pages
|
|
171
|
+
Why 3: → Because each validation step has its own page
|
|
172
|
+
Why 4: → Because the original design assumed slow connections
|
|
173
|
+
Why 5: → It doesn't — most users are on broadband now
|
|
217
174
|
Root Cause: Outdated multi-page architecture designed for dial-up era
|
|
218
175
|
```
|
|
219
176
|
|
|
220
177
|
### Step 1.2: HMW — Problem Reframing
|
|
221
178
|
|
|
222
|
-
Transform the root cause into opportunity-focused questions.
|
|
223
|
-
|
|
224
|
-
**Format:** "How might we [verb] [desired outcome] for [stakeholder]?"
|
|
225
|
-
|
|
226
|
-
**Rules:**
|
|
227
|
-
- Broad enough to allow creative solutions
|
|
228
|
-
- Specific enough to be actionable
|
|
229
|
-
- Never include a solution in the question
|
|
230
|
-
|
|
231
|
-
**Example HMW Questions:**
|
|
179
|
+
Transform the root cause into opportunity-focused questions. **Format:** "How might we [verb] [desired outcome] for [stakeholder]?" Broad enough for creative solutions, specific enough to be actionable, and never containing a solution.
|
|
232
180
|
|
|
233
181
|
```
|
|
234
182
|
Root Cause: Outdated multi-page checkout architecture
|
|
235
|
-
|
|
236
183
|
HMW 1: How might we reduce checkout steps without losing validation?
|
|
237
184
|
HMW 2: How might we make the checkout feel instant?
|
|
238
185
|
HMW 3: How might we validate data without interrupting the user flow?
|
|
@@ -240,8 +187,6 @@ HMW 3: How might we validate data without interrupting the user flow?
|
|
|
240
187
|
|
|
241
188
|
### Step 1.3: Stakeholder Mapping
|
|
242
189
|
|
|
243
|
-
Identify who is affected and their needs.
|
|
244
|
-
|
|
245
190
|
| Stakeholder | Needs | Pain Points |
|
|
246
191
|
|-------------|-------|-------------|
|
|
247
192
|
| End users | Fast, simple checkout | Too many steps |
|
|
@@ -250,214 +195,137 @@ Identify who is affected and their needs.
|
|
|
250
195
|
|
|
251
196
|
### Step 1.4: Codebase Context (if applicable)
|
|
252
197
|
|
|
253
|
-
When brainstorming for an existing project, gather context:
|
|
254
|
-
|
|
255
|
-
- **Read** `README.md`, `package.json` for project overview
|
|
256
|
-
- **Grep** for related features, existing implementations
|
|
257
|
-
- **Glob** for relevant file structures
|
|
258
|
-
|
|
259
|
-
This grounds ideation in reality and prevents proposing ideas that conflict with existing architecture.
|
|
198
|
+
When brainstorming for an existing project, gather context: **Read** `README.md`/`package.json`; **Grep** for related features; **Glob** for relevant structures. This grounds ideation in reality and prevents proposing ideas that conflict with existing architecture.
|
|
260
199
|
|
|
261
200
|
---
|
|
262
201
|
|
|
263
|
-
## Phase 2: DIVERGE | 發散思考(
|
|
202
|
+
## Phase 2: DIVERGE | 發散思考(v3:persona 集成 + 多樣性透鏡)
|
|
264
203
|
|
|
265
|
-
> Goal:
|
|
204
|
+
> Goal: Force genuinely distinct viewpoints, not variations on one theme.
|
|
266
205
|
>
|
|
267
|
-
>
|
|
268
|
-
|
|
269
|
-
### The 10-Idea Gate
|
|
270
|
-
|
|
271
|
-
**Research basis:** Nijstad et al. show that the most creative ideas appear in
|
|
272
|
-
the second half of a divergence session. Stopping at 3–5 ideas almost always
|
|
273
|
-
means stopping in the "obvious answer zone."
|
|
206
|
+
> 目標:逼出真正不同的視角,而非同一主題的變體。
|
|
274
207
|
|
|
275
|
-
|
|
276
|
-
the status shows `Continue diverging (N/10)`.
|
|
208
|
+
### Why a persona ensemble (not a count gate)
|
|
277
209
|
|
|
278
|
-
|
|
210
|
+
Meincke, Mollick & Terwiesch (2024) found that **chain-of-thought + personas** produces the highest idea diversity of any prompting strategy tested, approaching human brainstorming groups. The old "generate ≥10 ideas" gate rested on Nijstad's "best ideas appear in the second half" — a **human-group** finding that is **not confirmed for LLMs**, which tend to plateau and recycle. So v3 makes the *structure* (distinct personas + lenses), not the *count*, the engine of diversity.
|
|
279
211
|
|
|
280
|
-
|
|
212
|
+
### Step 2a — Persona ensemble
|
|
281
213
|
|
|
282
|
-
|
|
283
|
-
- Speed over depth
|
|
284
|
-
- No idea is wrong
|
|
285
|
-
- Label the batch "Intuition Batch — fast, unfiltered"
|
|
286
|
-
- Display `✓ Intuition batch complete` after idea 5
|
|
214
|
+
Run a default ensemble. Each persona reasons **step by step (chain-of-thought)** and produces 2–4 ideas **from its own lens only**.
|
|
287
215
|
|
|
288
|
-
|
|
216
|
+
| Default persona | Lens it argues from |
|
|
217
|
+
|-----------------|---------------------|
|
|
218
|
+
| **Domain expert** | What does best-practice in this domain demand? |
|
|
219
|
+
| **Skeptic / risk** | Where does this break? What fails first? |
|
|
220
|
+
| **Cross-domain analogist** | How do biology / other fields solve an analogous problem? |
|
|
221
|
+
| **Cost / constraint** | What is the cheapest, smallest thing that works? |
|
|
222
|
+
| **End-user advocate** | What does the actual user feel and need? |
|
|
289
223
|
|
|
290
|
-
|
|
224
|
+
**Template per persona:**
|
|
291
225
|
|
|
292
|
-
Display before starting:
|
|
293
226
|
```
|
|
294
|
-
|
|
295
|
-
|
|
296
|
-
|
|
227
|
+
Persona: [name] — Lens: [one line]
|
|
228
|
+
Reasoning (step by step): [chain-of-thought]
|
|
229
|
+
Ideas (2–4, from this lens only):
|
|
230
|
+
1. [Idea] — [why this persona would propose it]
|
|
231
|
+
2. ...
|
|
297
232
|
```
|
|
298
233
|
|
|
299
|
-
|
|
300
|
-
- If a proposed idea shares a theme type with any Batch 1 idea, flag:
|
|
301
|
-
`⚠ Semantic overlap — try a different direction`
|
|
302
|
-
- The user may still submit the idea; the flag is advisory only
|
|
303
|
-
|
|
304
|
-
### Continuing past 10
|
|
305
|
-
|
|
306
|
-
Users may continue beyond 10 ideas without limit. No upper gate exists.
|
|
307
|
-
After 10, the "Enter CONVERGE" option appears alongside "Continue diverging".
|
|
234
|
+
Override with `--personas "designer,economist,skeptic,..."`. Six Thinking Hats map naturally onto personas (White=facts, Red=emotion, Black=risk, Yellow=benefit, Green=creativity, Blue=process).
|
|
308
235
|
|
|
309
|
-
###
|
|
236
|
+
### Branch isolation
|
|
310
237
|
|
|
311
|
-
|
|
312
|
-
|-----------|-------------|----------|
|
|
313
|
-
| **HMW Questions** | Default starting point | 預設起點 |
|
|
314
|
-
| **SCAMPER** | Improving existing features | 改善現有功能 |
|
|
315
|
-
| **Six Thinking Hats** | Need multiple perspectives | 需要多角度思考 |
|
|
316
|
-
|
|
317
|
-
#### Technique A: HMW Brainstorming (Default)
|
|
318
|
-
|
|
319
|
-
For each HMW question, generate 3–5 solution ideas.
|
|
320
|
-
|
|
321
|
-
**Template:**
|
|
322
|
-
|
|
323
|
-
```
|
|
324
|
-
HMW: How might we [question]?
|
|
325
|
-
|
|
326
|
-
Ideas:
|
|
327
|
-
1. [Idea] — [Brief explanation]
|
|
328
|
-
2. [Idea] — [Brief explanation]
|
|
329
|
-
3. [Idea] — [Brief explanation]
|
|
330
|
-
4. [Idea] — [Brief explanation]
|
|
331
|
-
5. [Idea] — [Brief explanation]
|
|
332
|
-
```
|
|
238
|
+
In **baseline** mode, generate each persona's ideas without showing it the other personas' output, then present all sets together only after every persona is done. This prevents intra-session anchoring — the same mechanism Pre-flight protects against, applied between personas. In the **Enhanced tier**, this isolation is physical: each persona is a separate agent with its own context (see Enhanced Tier).
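Baseline branch isolation can be sketched as a loop in which each persona's prompt is built only from the problem and that persona's lens, so no prompt ever contains another persona's ideas. This is an illustration under stated assumptions: `run_isolated_ensemble` and the `generate` callable (standing in for whatever model call the host provides) are hypothetical names, and the prompt wording is only loosely based on the per-persona template.

```python
from typing import Callable, Dict, List


def run_isolated_ensemble(
    problem: str,
    personas: Dict[str, str],
    generate: Callable[[str], List[str]],
) -> Dict[str, List[str]]:
    """Collect ideas persona by persona.

    `generate` is a hypothetical stand-in for the host's model call.
    No prompt includes output from any other persona.
    """
    results: Dict[str, List[str]] = {}
    for name, lens in personas.items():
        prompt = (
            f"Persona: {name} | Lens: {lens}\n"
            f"Problem: {problem}\n"
            "Reason step by step, then give 2-4 ideas from this lens only."
        )
        results[name] = generate(prompt)
    # Sets are returned together only after every persona has finished.
    return results
```

Because the results dictionary is never read while prompts are being built, the isolation holds by construction. In the Enhanced tier the same shape would apply, with each `generate` call dispatched to a separate agent.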
|
|
333
239
|
|
|
334
|
-
|
|
240
|
+
### Step 2b — Diversity lenses
|
|
335
241
|
|
|
336
|
-
Apply
|
|
242
|
+
Apply at least one lens across the ensemble to push past the obvious zone. Connecting disparate concepts measurably increases originality (Mehrotra et al., 2024).
|
|
337
243
|
|
|
338
|
-
|
|
|
339
|
-
|
|
340
|
-
| **
|
|
341
|
-
| **
|
|
342
|
-
| **
|
|
343
|
-
| **M** | Modify | What can we enlarge, minimize, or change? | Minimize form fields to email-only |
|
|
344
|
-
| **P** | Put to other use | Can this serve a different purpose? | Use onboarding flow as feature tutorial |
|
|
345
|
-
| **E** | Eliminate | What can we remove entirely? | Eliminate email verification step |
|
|
346
|
-
| **R** | Reverse | What if we did the opposite? | Let users use first, register later |
|
|
244
|
+
| Lens | Prompt pattern | Best for |
|
|
245
|
+
|------|----------------|----------|
|
|
246
|
+
| **Analogical / cross-domain** | "Find a system in [biology / logistics / games] that solves an analogous problem. What principles can we borrow?" | Stuck in domain conventions |
|
|
247
|
+
| **Assumption reversal** | "List what everyone assumes must be true, then invert each one. Which inversion is interesting?" | 'Obvious' problem framings |
|
|
248
|
+
| **Morphological matrix** | "Build a 3-axis matrix (e.g. User × Trigger × Constraint); fill the rare/empty cells." | Systematic coverage |
|
|
347
249
|
|
|
348
|
-
|
|
250
|
+
At least one inverted assumption (reversal lens) should survive into the candidate set. Force a primary lens with `--lens analogical|reversal|morphological`.
|
|
349
251
|
|
|
350
|
-
|
|
351
|
-
Feature being improved: [feature name]
|
|
352
|
-
|
|
353
|
-
S - Substitute: [idea]
|
|
354
|
-
C - Combine: [idea]
|
|
355
|
-
A - Adapt: [idea]
|
|
356
|
-
M - Modify: [idea]
|
|
357
|
-
P - Put to use: [idea]
|
|
358
|
-
E - Eliminate: [idea]
|
|
359
|
-
R - Reverse: [idea]
|
|
360
|
-
```
|
|
252
|
+
### Step 2c — Continue nudge (auxiliary)
|
|
361
253
|
|
|
362
|
-
|
|
254
|
+
A raw count is a weak proxy in AI contexts. Use diversity, not count, as the gate: if the ensemble has covered fewer than ~8 distinct ideas **or** fewer than 3 distinct lenses, prompt: *"Continue — add a persona or a lens you haven't used yet."* There is no upper limit.
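The nudge condition above can be expressed as a small predicate. This is a sketch, assuming that "distinct" means distinct after whitespace and case normalisation; judging semantic distinctness would need something stronger. The names `should_nudge`, `MIN_DISTINCT_IDEAS`, and `MIN_DISTINCT_LENSES` are illustrative, and the threshold of 8 is the guide's approximate figure, so it is kept tunable.

```python
from typing import Iterable

MIN_DISTINCT_IDEAS = 8    # the guide says "~8"; treat as tunable
MIN_DISTINCT_LENSES = 3


def should_nudge(ideas: Iterable[str], lenses_used: Iterable[str]) -> bool:
    """Return True when the ensemble should be asked to add a persona or lens."""
    # Assumption: ideas count as duplicates only if equal after normalisation.
    distinct_ideas = {" ".join(idea.lower().split()) for idea in ideas}
    distinct_lenses = {lens.strip().lower() for lens in lenses_used}
    return (
        len(distinct_ideas) < MIN_DISTINCT_IDEAS
        or len(distinct_lenses) < MIN_DISTINCT_LENSES
    )
```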
|
|
363
255
|
|
|
364
|
-
|
|
256
|
+
### Classic techniques (still available)
|
|
365
257
|
|
|
366
|
-
|
|
367
|
-
|-----|-------|-------|----------|
|
|
368
|
-
| 1 | White | Facts & Data | What do we know? What data do we have? |
|
|
369
|
-
| 2 | Red | Emotions & Intuition | What does our gut say? How do users feel? |
|
|
370
|
-
| 3 | Black | Risks & Caution | What could go wrong? What are the risks? |
|
|
371
|
-
| 4 | Yellow | Benefits & Optimism | What's the best case? What value does this add? |
|
|
372
|
-
| 5 | Green | Creativity | What new ideas emerge? What if we...? |
|
|
373
|
-
| 6 | Blue | Process & Summary | What's the big picture? What's our next step? |
|
|
258
|
+
HMW (default starting point), SCAMPER (improving an existing feature: Substitute, Combine, Adapt, Modify, Put-to-other-use, Eliminate, Reverse), and Six Thinking Hats remain available and compose well as personas.
|
|
374
259
|
|
|
375
260
|
---
|
|
376
261
|
|
|
377
|
-
## Phase 3: CONVERGE |
|
|
262
|
+
## Phase 3: CONVERGE | 收斂(v3:多評審面板 + 硬角色反駁)
|
|
378
263
|
|
|
379
|
-
> Goal: Select ideas that survive
|
|
264
|
+
> Goal: Select ideas that survive bias-reduced scoring AND structured debate — with the human as final arbiter.
|
|
380
265
|
>
|
|
381
|
-
>
|
|
266
|
+
> 目標:選出同時通過降偏誤評分與結構化辯論的想法——人類保留最終裁決權。
|
|
382
267
|
|
|
383
|
-
### Step 3a:
|
|
268
|
+
### Step 3a: Multi-critic panel
|
|
384
269
|
|
|
385
|
-
|
|
270
|
+
A single LLM is a weak, biased evaluator (Li et al., 2025: LLMs are strong at generation/refinement, weak at evaluation). v3 runs **three independent critics**, each scoring every idea 1–5 on its own lens; aggregate by mean.
|
|
386
271
|
|
|
387
|
-
|
|
|
388
|
-
|
|
389
|
-
| **
|
|
390
|
-
| **
|
|
391
|
-
| **
|
|
392
|
-
| **Alignment** | 20% | 5=core mission, 4=strategic, 3=relevant, 2=tangential, 1=off-mission |
|
|
272
|
+
| Critic lens | Weighted criteria it owns |
|
|
273
|
+
|-------------|---------------------------|
|
|
274
|
+
| **Engineering feasibility** | Feasibility 50% · Effort 50% |
|
|
275
|
+
| **User impact** | Impact 70% · Alignment 30% |
|
|
276
|
+
| **Strategic alignment** | Alignment 60% · Impact 40% |
|
|
393
277
|
|
|
394
|
-
**
|
|
278
|
+
**Per-criterion guide (1–5):**
|
|
395
279
|
|
|
396
|
-
|
|
397
|
-
|
|
398
|
-
|
|
280
|
+
| Criterion | 5 | 3 | 1 |
|
|
281
|
+
|-----------|---|---|---|
|
|
282
|
+
| Feasibility | trivial | moderate | near-impossible |
|
|
283
|
+
| Impact | transformative | moderate | negligible |
|
|
284
|
+
| Effort (inverted) | hours | weeks | quarters |
|
|
285
|
+
| Alignment | core mission | relevant | off-mission |
|
|
399
286
|
|
|
400
|
-
**
|
|
287
|
+
**Aggregation example:**
|
|
401
288
|
|
|
402
|
-
| # | Idea | Feasibility | Impact
|
|
403
|
-
|
|
404
|
-
| 1 | Single-page checkout | 4 | 5 |
|
|
405
|
-
| 2 | One-click buy | 3 |
|
|
406
|
-
| 3 | Progressive form | 5 | 4 | 4 |
|
|
407
|
-
| 4 | Guest checkout | 5 | 3 | 5 | 3 | **4.0** |
|
|
289
|
+
| # | Idea | Feasibility critic | Impact critic | Alignment critic | **Agg.** |
|
|
290
|
+
|---|------|--------------------|---------------|------------------|----------|
|
|
291
|
+
| 1 | Single-page checkout | 4.0 | 4.5 | 4.5 | **4.3** |
|
|
292
|
+
| 2 | One-click buy | 3.0 | 3.5 | 3.5 | **3.3** |
|
|
293
|
+
| 3 | Progressive form | 4.5 | 4.0 | 4.0 | **4.2** |
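The panel arithmetic can be sketched as follows: each critic combines its owned criteria using the weights from the critic table, and the aggregate is the plain mean of the three critic scores, rounded to one decimal as in the example rows. The function and dictionary names are assumptions made for illustration; only the weights and the example scores come from the guide.

```python
from statistics import mean
from typing import Dict

# Weights copied from the critic-lens table; key names are illustrative.
CRITIC_WEIGHTS: Dict[str, Dict[str, float]] = {
    "engineering_feasibility": {"feasibility": 0.5, "effort": 0.5},
    "user_impact": {"impact": 0.7, "alignment": 0.3},
    "strategic_alignment": {"alignment": 0.6, "impact": 0.4},
}


def critic_score(critic: str, criteria: Dict[str, float]) -> float:
    """Weighted 1-5 score for one critic over the criteria it owns."""
    weights = CRITIC_WEIGHTS[critic]
    return sum(weight * criteria[name] for name, weight in weights.items())


def aggregate(critic_scores: Dict[str, float]) -> float:
    """Unweighted mean of the critic scores, rounded to one decimal."""
    return round(mean(list(critic_scores.values())), 1)


if __name__ == "__main__":
    # Critic scores taken from the aggregation example table.
    single_page = {"feasibility": 4.0, "impact": 4.5, "alignment": 4.5}
    print(aggregate(single_page))
```

The human arbiter still makes the final call; the aggregate only orders candidates for the rebuttal round.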
|
|
408
294
|
|
|
409
|
-
|
|
295
|
+
> **Optional — RICE / ICE (product features):** for prioritising shippable features, score `RICE = (Reach × Impact × Confidence) / Effort` or the lighter `ICE = Impact × Confidence × Ease`. Let engineers — not the LLM — estimate Effort (the LLM lacks codebase knowledge). RICE favours incremental wins; don't use it alone for strategic bets.
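The two formulas in the note above translate directly into code. This is a minimal sketch; the function names and the sample inputs are hypothetical, and the Effort figure is meant to come from an engineer's estimate rather than from the LLM.

```python
def rice(reach: float, impact: float, confidence: float, effort: float) -> float:
    """RICE = (Reach x Impact x Confidence) / Effort."""
    if effort <= 0:
        raise ValueError("effort must be positive")
    return (reach * impact * confidence) / effort


def ice(impact: float, confidence: float, ease: float) -> float:
    """ICE = Impact x Confidence x Ease."""
    return impact * confidence * ease


# Hypothetical inputs: 2000 users reached, impact 2, 50% confidence,
# 4 person-weeks of effort as estimated by an engineer.
print(rice(reach=2000, impact=2, confidence=0.5, effort=4))
print(ice(impact=8, confidence=7, ease=6))
```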
|
|
410
296
|
|
|
411
|
-
|
|
412
|
-
better ideas than groups following "no-criticism" rules. The mechanism is that
|
|
413
|
-
criticism forces explicit defence of assumptions, which surfaces hidden
|
|
414
|
-
weaknesses before commitment.
|
|
297
|
+
### Step 3b: Hard-role Rebuttal Round
|
|
415
298
|
|
|
416
|
-
|
|
299
|
+
A soft "please critique this" yields mostly agreement — LLMs are sycophantic under a weak critique frame. v3 assigns **hard roles** to the **top 3 ideas**:
|
|
417
300
|
|
|
418
|
-
|
|
419
|
-
|
|
420
|
-
"This idea will fail in [specific context] because [specific reason]."
|
|
421
|
-
```
|
|
301
|
+
- **Devil's Advocate**: "Your job is to argue this idea WILL fail. Produce 2 specific failure conditions."
|
|
302
|
+
- **Steelman**: "State the strongest, most charitable version of the counterargument — the one a thoughtful opponent would actually make."
|
|
422
303
|
|
|
423
|
-
|
|
424
|
-
- "This might be difficult to implement."
|
|
425
|
-
- "There could be edge cases."
|
|
304
|
+
Each counterargument must take the form: **"This idea will fail in [specific context] because [specific reason]."**
|
|
426
305
|
|
|
427
|
-
**
|
|
428
|
-
- "This idea will fail for enterprise customers because their IT policy
|
|
429
|
-
prohibits storing OAuth tokens in browser localStorage."
|
|
430
|
-
- "This idea will fail during peak traffic because the synchronous
|
|
431
|
-
API call blocks the render thread, causing visible jank at 500ms+."
|
|
306
|
+
**NOT acceptable** (too vague): "This might be difficult." / "There could be edge cases."
|
|
432
307
|
|
|
433
|
-
|
|
308
|
+
**Acceptable** (specific failure condition):
|
|
309
|
+
- "This will fail for enterprise customers because their IT policy prohibits storing OAuth tokens in browser localStorage."
|
|
310
|
+
- "This will fail during peak traffic because the synchronous API call blocks the render thread, causing visible jank at 500ms+."
|
|
434
311
|
|
|
435
|
-
The user MUST
|
|
312
|
+
The user MUST respond to each before advancing:
|
|
436
313
|
|
|
437
314
|
| Option | Action |
|
|
438
315
|
|--------|--------|
|
|
439
|
-
| (a) Accept
|
|
440
|
-
| (b) Disagree | Provide a specific reason
|
|
441
|
-
| (c)
|
|
442
|
-
|
|
443
|
-
### Rebuttal outcome in report
|
|
316
|
+
| (a) Accept | Provide a modified version that addresses the failure |
|
|
317
|
+
| (b) Disagree | Provide a specific reason the counterargument does not apply |
|
|
318
|
+
| (c) Valid | Remove the idea from the ranking |
|
|
444
319
|
|
|
445
|
-
Each idea that remains
|
|
320
|
+
Each idea that remains receives a badge: `✓ Passed rebuttal — [one-line summary of user's response]`.
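The response rules in the option table can be sketched as a small resolver. This is an illustration only: `RebuttalResponse` and `resolve` are hypothetical names, and joining several one-line summaries with "; " is an assumption, since the guide does not say how multiple responses are merged into one badge.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RebuttalResponse:
    option: str   # "a" = accept, "b" = disagree, "c" = valid (remove the idea)
    summary: str  # one-line summary of the user's response


def resolve(responses: list[RebuttalResponse]) -> Optional[str]:
    """Return the badge for a surviving idea, or None if the idea is removed."""
    if not responses:
        raise ValueError("every counterargument needs a response before advancing")
    for r in responses:
        if r.option not in {"a", "b", "c"}:
            raise ValueError(f"unknown option: {r.option!r}")
        if r.option in {"a", "b"} and not r.summary.strip():
            raise ValueError("options (a) and (b) require a specific response")
    if any(r.option == "c" for r in responses):
        return None  # counterargument judged valid: drop the idea from the ranking
    summary = "; ".join(r.summary.strip() for r in responses)
    # \u2713 is the check mark and \u2014 the dash used in the guide's badge format.
    return f"\u2713 Passed rebuttal \u2014 {summary}"
```

Requiring a non-empty summary for options (a) and (b) mirrors the table, where both options ask the user to provide something concrete.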
|
|
446
321
|
|
|
447
|
-
|
|
448
|
-
✓ Passed rebuttal — [one-line summary of user's response]
|
|
449
|
-
```
|
|
450
|
-
|
|
451
|
-
**Skipping:** `--no-rebuttal` skips the rebuttal round. The report section is
|
|
452
|
-
marked "Rebuttal: skipped".
|
|
322
|
+
**Skipping:** `--no-rebuttal` skips this round; the report section is marked "Rebuttal: skipped".
|
|
453
323
|
|
|
454
324
|
---
|
|
455
325
|
|
|
456
326
|
## Phase 4: OUTPUT | 輸出提案
|
|
457
327
|
|
|
458
328
|
> Goal: Produce a structured report that feeds directly into `/requirement` or `/sdd`.
|
|
459
|
-
>
|
|
460
|
-
> 目標:產生可直接輸入 `/requirement` 或 `/sdd` 的結構化報告。
|
|
461
329
|
|
|
462
330
|
### Brainstorm Report Template
|
|
463
331
|
|
|
@@ -466,505 +334,231 @@ marked "Rebuttal: skipped".
|
|
|
466
334
|
|
|
467
335
|
**Date**: YYYY-MM-DD
|
|
468
336
|
**Participants**: [human, AI assistant]
|
|
469
|
-
**
|
|
470
|
-
**
|
|
471
|
-
**Rebuttal
|
|
337
|
+
**Personas Used**: [domain expert, skeptic, analogist, ...]
|
|
338
|
+
**Lenses Used**: [analogical, reversal, ...]
|
|
339
|
+
**Pre-flight**: [Completed / Skipped] **Rebuttal**: [Completed / Skipped] **Tier**: [Baseline / Enhanced]
|
|
472
340
|
|
|
473
341
|
## Problem Statement
|
|
474
|
-
|
|
475
|
-
[Refined problem statement from FRAME phase, including root cause from 5 Whys]
|
|
342
|
+
[Refined problem + root cause from 5 Whys]
|
|
476
343
|
|
|
477
344
|
## HMW Questions
|
|
478
|
-
|
|
479
345
|
1. How might we ...?
|
|
480
|
-
2. How might we ...?
|
|
481
|
-
3. How might we ...?
|
|
482
346
|
|
|
483
347
|
## Ideas Generated
|
|
484
|
-
|
|
485
|
-
|
|
486
|
-
|
|
487
|
-
| 1 | ... | B1 | SCAMPER-R | 4 | 5 | 3 | 5 | 4.3 |
|
|
488
|
-
| 2 | ... | B2 | HMW | 3 | 4 | 2 | 4 | 3.3 |
|
|
489
|
-
| 3 | ... | B2 | Six Hats-Green | 5 | 4 | 4 | 4 | 4.3 |
|
|
348
|
+
| # | Idea | Persona | Lens | Feas. critic | Impact critic | Align. critic | Agg. |
|
|
349
|
+
|---|------|---------|------|--------------|---------------|---------------|------|
|
|
350
|
+
| 1 | ... | Skeptic | Reversal | 4.0 | 4.5 | 4.0 | 4.2 |
|
|
490
351
|
|
|
491
352
|
## Top 3 Recommendations
|
|
353
|
+
### 1. [Idea] (Agg. X.X) ✓ Passed rebuttal
|
|
354
|
+
- **Why**: [Reasoning] - **Persona/Lens**: [..] - **Rebuttal response**: [one line] - **Scope**: [S/M/L]
|
|
492
355
|
|
|
493
|
-
|
|
494
|
-
|
|
495
|
-
- **Key Benefit**: [Primary value]
|
|
496
|
-
- **Rebuttal response**: [One-line summary of how user addressed the challenge]
|
|
497
|
-
- **Estimated Scope**: [Small / Medium / Large]
|
|
498
|
-
|
|
499
|
-
### 2. [Idea Name] (Score: X.X) ✓ Passed rebuttal
|
|
500
|
-
- **Why**: [Reasoning]
|
|
501
|
-
- **Key Benefit**: [Primary value]
|
|
502
|
-
- **Rebuttal response**: [One-line summary]
|
|
503
|
-
- **Estimated Scope**: [Small / Medium / Large]
|
|
504
|
-
|
|
505
|
-
### 3. [Idea Name] (Score: X.X) ✓ Passed rebuttal
|
|
506
|
-
- **Why**: [Reasoning]
|
|
507
|
-
- **Key Benefit**: [Primary value]
|
|
508
|
-
- **Rebuttal response**: [One-line summary]
|
|
509
|
-
- **Estimated Scope**: [Small / Medium / Large]
|
|
356
|
+
## Diversity Note
|
|
357
|
+
[How many distinct personas/lenses the surviving ideas span; flag if all from one cluster]
|
|
510
358
|
|
|
511
359
|
## Discarded Ideas (with reasons)
|
|
512
|
-
|
|
513
|
-
|
|
514
|
-
|
|
515
|
-
| ... | Removed during rebuttal round (counterargument accepted) |
|
|
516
|
-
| ... | Low feasibility (score: 1/5) |
|
|
360
|
+
| Idea | Reason |
|
|
361
|
+
|------|--------|
|
|
362
|
+
| ... | Removed during rebuttal (counterargument accepted) |
|
|
517
363
|
|
|
518
364
|
## Next Steps
|
|
519
|
-
|
|
520
365
|
- [ ] Proceed to `/requirement` with recommendation #1
|
|
521
366
|
- [ ] Proceed to `/sdd` if requirements are already clear
|
|
522
|
-
- [ ] Conduct follow-up brainstorm on [subtopic]
|
|
523
367
|
```
|
|
524
368
|
|
|
525
369
|
---
|
|
526
370
|
|
|
527
|
-
##
|
|
528
|
-
|
|
529
|
-
| Flag | Phase affected | Behaviour |
|
|
530
|
-
|------|---------------|-----------|
|
|
531
|
-
| `--skip-preflight` | Phase 0 | Bypass Pre-flight; display one-line anchoring warning |
|
|
532
|
-
| `--no-rebuttal` | Phase 3 | Skip rebuttal round; mark report section "Rebuttal: skipped" |
|
|
533
|
-
| `--quick` | All | 3-idea fast mode; Phase 0, 10-idea gate, and rebuttal all exempt |
|
|
534
|
-
| `--technique scamper` | Phase 2 | Force SCAMPER as primary divergence technique |
|
|
535
|
-
|
|
536
|
-
### Quick Mode (`--quick`)
|
|
537
|
-
|
|
538
|
-
Delivers results in under 5 minutes. Output is 20 lines maximum.
|
|
539
|
-
|
|
540
|
-
```
|
|
541
|
-
1 HMW question → 3 ideas → 1 recommendation → next steps
|
|
542
|
-
```
|
|
543
|
-
|
|
544
|
-
All cognitive-science gates (Pre-flight, 10-idea minimum, Rebuttal Round) are
|
|
545
|
-
exempt in quick mode. Quick mode is appropriate for:
|
|
546
|
-
- Mid-coding-session decisions
|
|
547
|
-
- Re-scoping an already-understood problem
|
|
548
|
-
- Initial orientation before a full session
|
|
549
|
-
|
|
550
|
-
Always offer to expand: "Would you like to run a full brainstorming session?"
|
|
551
|
-
|
|
552
|
-
---
|
|
553
|
-
|
|
554
|
-
## Integration with UDS Workflow
|
|
555
|
-
|
|
556
|
-
The Brainstorm Report maps directly to downstream tools:
|
|
557
|
-
|
|
558
|
-
### Mapping to `/requirement`
|
|
559
|
-
|
|
560
|
-
| Brainstorm Report Section | `/requirement` Field |
|
|
561
|
-
|---------------------------|---------------------|
|
|
562
|
-
| Problem Statement | User Story context |
|
|
563
|
-
| Top Recommendation | Feature description |
|
|
564
|
-
| HMW Questions | Acceptance Criteria seeds |
|
|
565
|
-
| Stakeholder Map | Stakeholder section |
|
|
566
|
-
| Discarded Ideas | Out of Scope |
|
|
567
|
-
|
|
568
|
-
### Mapping to `/sdd`
|
|
569
|
-
|
|
570
|
-
| Brainstorm Report Section | `/sdd` Field |
|
|
571
|
-
|---------------------------|-------------|
|
|
572
|
-
| Problem Statement | Summary / Motivation |
|
|
573
|
-
| Top Recommendation | Proposed Solution |
|
|
574
|
-
| Evaluation Matrix | Trade-offs / Alternatives Considered |
|
|
575
|
-
| Rebuttal responses | Risks section |
|
|
576
|
-
| Estimated Scope | Scope section |
|
|
577
|
-
|
|
578
|
-
---
|
|
579
|
-
|
|
580
|
-
## Configuration Detection
|
|
371
|
+
## Diversity-Collapse Guardrail
|
|
581
372
|
|
|
582
|
-
|
|
373
|
+
Using a single LLM for ideation reduces the **diversity of ideas across users**, even when each individual feels more creative (Anderson, Shah & Kreminski, 2024; corroborated by Doshi & Hauser, *Science Advances* 2024). Concrete guards:
|
|
583
374
|
|
|
584
|
-
|
|
585
|
-
|
|
586
|
-
3
|
|
587
|
-
|
|
375
|
+
- **Never seed** with a competitor or product analogy ("like X but for Y").
|
|
376
|
+
- **Vary the lens**, not just the wording — rewording a prompt does not diversify output.
|
|
377
|
+
- If the surviving Top 3 all originate from one persona or lens, **flag it** and run one additional lens before OUTPUT.
|
|
378
|
+
- Prefer **lower-fidelity** idea statements early (a rough direction, not a polished concept) — high-fidelity AI output deepens fixation (finding #3).
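The third guard above (flag a Top 3 that comes from a single persona or lens) is easy to automate. A minimal sketch, assuming each surviving idea is a dictionary with hypothetical `persona` and `lens` keys:

```python
def needs_extra_lens(top_ideas: list[dict[str, str]]) -> bool:
    """True when every surviving idea shares one persona or one lens."""
    personas = {idea["persona"] for idea in top_ideas}
    lenses = {idea["lens"] for idea in top_ideas}
    return len(personas) == 1 or len(lenses) == 1


# Hypothetical Top 3: three personas, but all ideas came through one lens.
top3 = [
    {"idea": "A", "persona": "skeptic", "lens": "reversal"},
    {"idea": "B", "persona": "domain expert", "lens": "reversal"},
    {"idea": "C", "persona": "analogist", "lens": "reversal"},
]
print(needs_extra_lens(top3))  # one shared lens, so another lens should be run
```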
|
|
588
379
|
|
|
589
380
|
---
|
|
590
381
|
|
|
591
|
-
##
|
|
382
|
+
## Enhanced Tier — Parallel Personas
|
|
592
383
|
|
|
593
|
-
|
|
384
|
+
Multi-agent ideation, in which independent agents contribute and converse, outperforms a single agent on perceived outcome quality and novelty (Quan et al., 2025, *MultiColleagues*). Where the host supports parallel subagents (e.g. Claude Code's Agent/Workflow tools), `--enhanced` runs the ensemble as genuinely separate agents:
|
|
594
385
|
|
|
595
|
-
**
|
|
596
|
-
|
|
597
|
-
|
|
598
|
-
My ideas:
|
|
599
|
-
A: Send re-engagement emails after 7 days of inactivity
|
|
600
|
-
B: Add an achievement / gamification system
|
|
601
|
-
C: Show users a "what's new" summary on login
|
|
602
|
-
Do NOT want: Solutions requiring backend ML models (too slow to ship)
|
|
603
|
-
```
|
|
386
|
+
1. **Divergence**: each persona is a separate agent with **isolated context** (true branch isolation), run in parallel; results are merged and de-duplicated.
|
|
387
|
+
2. **Convergence**: the three critics run as parallel agents; the Devil's Advocate and Steelman are separate adversarial agents.
|
|
388
|
+
3. **Synthesis**: a final pass merges scores, flags diversity, and assembles the report.
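The divergence step above (parallel personas, then merge and de-duplicate) can be sketched in plain Python. Everything here is an assumption for illustration: `generate_ideas` is a hypothetical stand-in for an isolated persona agent and returns canned text, threads stand in for the host's subagents, and de-duplication is a simple normalised string match where a real host would likely need semantic matching.

```python
from concurrent.futures import ThreadPoolExecutor


def generate_ideas(persona: str, problem: str) -> list[str]:
    """Hypothetical stand-in for one persona agent with its own isolated context."""
    canned = {
        "domain expert": ["Progressive checklist", "Contextual tooltips"],
        "skeptic": ["Contextual tooltips ", "Remove the setup wizard"],
        "analogist": ["progressive checklist", "Peer-mentor matching"],
    }
    return canned.get(persona, [])


def diverge(personas: list[str], problem: str) -> list[tuple[str, str]]:
    """Run every persona in parallel, then merge and de-duplicate the results."""
    if not personas:
        return []
    with ThreadPoolExecutor(max_workers=len(personas)) as pool:
        # map() preserves persona order, so the merge below is deterministic.
        batches = list(pool.map(lambda p: generate_ideas(p, problem), personas))
    merged: list[tuple[str, str]] = []
    seen: set[str] = set()
    for persona, batch in zip(personas, batches):
        for idea in batch:
            key = " ".join(idea.lower().split())  # case- and whitespace-insensitive
            if key not in seen:
                seen.add(key)
                merged.append((persona, idea.strip()))
    return merged
```

Keeping the originating persona next to each merged idea is what later makes the diversity check on the Top 3 possible.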
|
|
604
389
|
|
|
605
|
-
|
|
606
|
-
```
|
|
607
|
-
5 Whys:
|
|
608
|
-
Problem: Retention at 15%
|
|
609
|
-
Why 1: Users stop using the app after initial signup → don't discover key features
|
|
610
|
-
Why 2: Why don't they discover features? → onboarding shows only basic setup
|
|
611
|
-
Why 3: Why basic setup only? → one-time wizard design assumption
|
|
612
|
-
Why 4: Why wizard? → team assumed users would explore independently
|
|
613
|
-
Why 5: Why does that assumption fail? → 20+ features, no progressive disclosure
|
|
614
|
-
|
|
615
|
-
Root Cause: No progressive onboarding — users see everything or nothing
|
|
616
|
-
|
|
617
|
-
HMW Questions:
|
|
618
|
-
1. How might we guide users to discover features at the right moment?
|
|
619
|
-
2. How might we make feature discovery feel natural, not forced?
|
|
620
|
-
3. How might we celebrate milestones to build engagement habits?
|
|
621
|
-
```
|
|
390
|
+
### Graceful degradation
|
|
622
391
|
|
|
623
|
-
**
|
|
624
|
-
```
|
|
625
|
-
AI note: Exploring directions beyond the user's A/B/C ideas
|
|
626
|
-
|
|
627
|
-
1. Contextual tooltips triggered by user behaviour [HMW-1]
|
|
628
|
-
2. Progressive checklist replacing one-time wizard [SCAMPER-S]
|
|
629
|
-
3. First-task onboarding (skip setup, do real work first) [SCAMPER-C]
|
|
630
|
-
4. Peer-mentor matching for new users [SCAMPER-R]
|
|
631
|
-
5. Habit streak tracker (daily login reward) [HMW-3]
|
|
632
|
-
✓ Intuition batch complete
|
|
633
|
-
```
|
|
634
|
-
|
|
635
|
-
**DIVERGE — Batch 2 (Extension — cross semantic boundary):**
|
|
636
|
-
```
|
|
637
|
-
Extension Batch: ideas must cross the semantic boundary of Batch 1
|
|
638
|
-
|
|
639
|
-
6. API-driven onboarding: detect user's data and auto-populate examples [HMW-1]
|
|
640
|
-
7. Reverse onboarding: show power-user workflow first, simplify on request [SCAMPER-R]
|
|
641
|
-
8. Social proof: show "your peers use feature X 3× per week" [HMW-2]
|
|
642
|
-
9. Feature unlock gates: earn access to advanced features via usage [HMW-3]
|
|
643
|
-
10. Cohort-based pacing: group users by signup week, send same tips together [HMW-1]
|
|
644
|
-
```
|
|
645
|
-
|
|
646
|
-
**CONVERGE — Scoring:**
|
|
647
|
-
```
|
|
648
|
-
| # | Idea | Feasibility | Impact | Effort | Align | Score |
|
|
649
|
-
|---|-------------------------------|-------------|--------|--------|-------|-------|
|
|
650
|
-
| 2 | Progressive checklist | 5 | 5 | 4 | 5 | 4.8 |
|
|
651
|
-
| 6 | API-driven onboarding | 4 | 5 | 3 | 5 | 4.3 |
|
|
652
|
-
| 1 | Contextual tooltips | 4 | 4 | 3 | 5 | 4.0 |
|
|
653
|
-
```
|
|
654
|
-
|
|
655
|
-
**CONVERGE — Rebuttal Round:**
|
|
656
|
-
```
|
|
657
|
-
Idea #2: Progressive checklist
|
|
658
|
-
|
|
659
|
-
Counterargument 1: "This idea will fail for returning users who re-install the
|
|
660
|
-
app because the checklist state is lost if tied to the device, causing
|
|
661
|
-
experienced users to re-do beginner tasks and feel patronised."
|
|
662
|
-
|
|
663
|
-
User response (b — disagree): "Checklist state is stored server-side tied to
|
|
664
|
-
user ID, so returning users resume from where they left off."
|
|
665
|
-
|
|
666
|
-
Counterargument 2: "This idea will fail for power users who feel checklists are
|
|
667
|
-
infantilising and will disable them immediately if there's no way to opt out."
|
|
668
|
-
|
|
669
|
-
User response (a — accept): Modified → Add a one-click 'I know this already'
|
|
670
|
-
dismiss on each checklist item, with a 'hide checklist' option in settings.
|
|
671
|
-
|
|
672
|
-
→ ✓ Passed rebuttal
|
|
673
|
-
```
|
|
674
|
-
|
|
675
|
-
**OUTPUT:** Top recommendation is "Progressive checklist (modified)" → proceed to `/requirement`.
|
|
392
|
+
This tier is **optional and host-dependent**. On assistants without subagent orchestration, `--enhanced` **silently falls back** to baseline (single-context simulated personas, sequential critics). The skill therefore remains `scope: universal` — every host gets the full methodology; only the execution substrate differs.
|
|
676
393
|
|
|
677
394
|
---
|
|
678
395
|
|
|
679
396
|
## Mode Selection Guide | 模式選擇指引
|
|
680
397
|
|
|
681
|
-
The
|
|
682
|
-
"which mode should I use?" decision overhead. This section explains the
|
|
683
|
-
rationale behind each rule.
|
|
684
|
-
|
|
685
|
-
SKILL.md 的模式選擇表使用客觀觸發條件,消除「我應該用哪個模式?」的決策負擔。本節說明各規則的設計理由。
|
|
686
|
-
|
|
687
|
-
### Why objective triggers instead of subjective diagnosis
|
|
688
|
-
|
|
689
|
-
The v2.0 brainstorm rebuttal session itself surfaced this problem: if users must
|
|
690
|
-
first diagnose "is my problem strategic or execution-type?", that meta-decision
|
|
691
|
-
consumes cognitive resources before the session even starts. Objective triggers
|
|
692
|
-
(word count, presence of a flag, existence of a spec) eliminate this.
|
|
693
|
-
|
|
694
|
-
### Trigger calibration
|
|
695
|
-
|
|
696
|
-
The `< 20 words` threshold is a starting heuristic, not a permanent rule.
|
|
697
|
-
After 5–10 sessions, review whether short inputs consistently led to
|
|
698
|
-
under-explored problems or whether they were legitimately simple. Adjust
|
|
699
|
-
the threshold based on observation, not intuition.
|
|
398
|
+
The Mode Selection table in SKILL.md uses objective triggers (word count, presence of a flag, existence of a spec) to remove the "which mode should I use?" decision overhead. If users must first diagnose "is my problem strategic or execution-type?", that meta-decision consumes cognitive resources before the session starts. The `< 20 words` threshold is a starting heuristic, not a permanent rule: review it after 5–10 sessions and adjust based on observation, not intuition.
|
|
700
399
|
|
|
701
400
|
---
|
|
702
401
|
|
|
703
402
|
## Self-Evaluation Framework | 自我評估框架
|
|
704
403
|
|
|
705
|
-
|
|
706
|
-
empirical record of quality over time. Do NOT evaluate v2.0 vs v1.0 based
|
|
707
|
-
on a single session — draw conclusions only after collecting at least 3
|
|
708
|
-
comparable sessions.
|
|
709
|
-
|
|
710
|
-
每次腦力激盪結束後使用這三個指標,建立長期品質紀錄。不要以單次工作階段評估 v2.0 vs v1.0,至少收集 3 次可比較的工作階段後再下結論。
|
|
711
|
-
|
|
712
|
-
### The Three Metrics | 三個指標
|
|
404
|
+
Record three metrics after every session to build an empirical record. Do not judge v3 vs v2 on a single session — collect at least 3 comparable sessions.
|
|
713
405
|
|
|
714
|
-
|
|
715
|
-
**Question:** Of all ideas generated in this session, how many will you actually use or investigate further?
|
|
406
|
+
每次工作階段後記錄三個指標,建立實證紀錄。不要以單次評估 v3 vs v2,至少收集 3 次可比較的工作階段。
|
|
716
407
|
|
|
717
|
-
|
|
718
|
-
- 5 = 3+ ideas directly actionable
|
|
719
|
-
- 4 = 2 ideas actionable, 1+ worth exploring
|
|
720
|
-
- 3 = 1 idea actionable
|
|
721
|
-
- 2 = No idea directly actionable, but useful frames emerged
|
|
722
|
-
- 1 = Session produced nothing useful
|
|
408
|
+
### The Three Metrics
|
|
723
409
|
|
|
724
|
-
|
|
410
|
+
#### 1. Adoption Rate
|
|
411
|
+
**Question:** Of all ideas generated, how many will you actually use or investigate further?
|
|
412
|
+
- 5 = 3+ directly actionable · 3 = 1 actionable · 1 = nothing useful
|
|
725
413
|
|
|
726
|
-
#### 2. Diversity
|
|
727
|
-
**Question:**
|
|
414
|
+
#### 2. Diversity (v3 definition)
|
|
415
|
+
**Question:** Did the surviving ideas span multiple **personas and lenses**, or cluster in one?
|
|
416
|
+
- 5 = surviving ideas span 3+ personas/lenses · 3 = they span 2 · 1 = all from one persona/lens
|
|
728
417
|
|
|
729
|
-
|
|
730
|
-
- 5 = Extension Batch explored completely different problem dimensions
|
|
731
|
-
- 4 = Extension Batch had 3+ ideas that clearly crossed semantic boundaries
|
|
732
|
-
- 3 = Some extension, but most ideas were variations on Batch 1 themes
|
|
733
|
-
- 2 = Extension Batch was effectively a continuation of Batch 1
|
|
734
|
-
- 1 = No meaningful semantic difference between batches
|
|
418
|
+
This replaces the v2 "Extension Batch vs Intuition Batch" diversity question — v3 measures cross-persona/lens spread directly.
|
|
735
419
|
|
|
736
|
-
|
|
420
|
+
#### 3. Cognitive Load
|
|
421
|
+
**Question:** How mentally taxing was this session? (5 = effortless, 1 = exhausting)
|
|
737
422
|
|
|
738
|
-
|
|
739
|
-
**Question:** How mentally taxing was this session? (Higher score = lower burden)
|
|
423
|
+
A method that consistently scores 1–2 on cognitive load will be abandoned regardless of quality. Target: cognitive load ≥ 3 while adoption and diversity also improve.
|
|
740
424
|
|
|
741
|
-
|
|
742
|
-
- 5 = Session felt effortless and generative
|
|
743
|
-
- 4 = Some friction but overall productive
|
|
744
|
-
- 3 = Moderate effort, a few frustrating moments
|
|
745
|
-
- 2 = Session felt like work throughout
|
|
746
|
-
- 1 = Exhausting; would avoid repeating this format
|
|
747
|
-
|
|
748
|
-
**Why this matters:** A brainstorming method that consistently scores 1–2 on cognitive load will be abandoned in favour of informal thinking, regardless of quality improvements. Target: cognitive load ≥ 3 while adoption rate and diversity are also improving.
|
|
749
|
-
|
|
750
|
-
### Session Log Template | 工作階段記錄模板
|
|
425
|
+
### Session Log Template
|
|
751
426
|
|
|
752
427
|
```
|
|
753
428
|
Date: YYYY-MM-DD
|
|
754
429
|
Topic: [one sentence]
|
|
755
|
-
Mode: [Full / Quick / No-Rebuttal / Skip-Preflight]
|
|
430
|
+
Mode: [Full v3 / Enhanced / Quick / No-Rebuttal / Skip-Preflight]
|
|
431
|
+
Personas/Lenses used: [...]
|
|
756
432
|
Duration: [minutes]
|
|
757
433
|
|
|
758
434
|
Adoption Rate: /5 — [reason]
|
|
759
435
|
Diversity: /5 — [reason]
|
|
760
436
|
Cognitive Load: /5 — [reason]
|
|
761
437
|
|
|
762
|
-
Notable observation:
|
|
763
|
-
[One sentence on what worked or what felt wrong]
|
|
438
|
+
Notable observation: [one sentence]
|
|
764
439
|
```
|
|
765
440
|
|
|
766
|
-
### Interpreting Trends
|
|
767
|
-
|
|
768
|
-
After 3+ sessions, look for these patterns:
|
|
441
|
+
### Interpreting Trends
|
|
769
442
|
|
|
770
443
|
| Pattern | Interpretation | Action |
|
|
771
444
|
|---------|---------------|--------|
|
|
772
|
-
| Adoption
|
|
773
|
-
| Diversity
|
|
774
|
-
| Cognitive Load
|
|
775
|
-
| All three
|
|
776
|
-
| Adoption Rate ↑ but Cognitive Load ↓ over sessions | Habituation — the new flow is becoming natural | Continue; occasional `--quick` refreshes |
|
|
445
|
+
| Adoption ≤ 2 consistently | Problem framing failing in FRAME | Spend more time on 5 Whys |
|
|
446
|
+
| Diversity ≤ 2 consistently | Personas/lenses producing overlapping ideas | Add a more distant persona; force the reversal lens |
|
|
447
|
+
| Cognitive Load ≤ 2 consistently | Process overhead too high | Use `--no-rebuttal` or `--quick` for lower-stakes problems |
|
|
448
|
+
| All three ≥ 4 | v3 working well for this problem type | No change |
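The trend table can be applied mechanically to a set of session logs. The sketch below assumes each log is a dictionary with hypothetical keys `adoption`, `diversity`, and `cognitive_load`, and reads "consistently" as "in every logged session, with at least three sessions available"; the guide does not define the term more precisely.

```python
ACTIONS = {
    "adoption": "Spend more time on 5 Whys",
    "diversity": "Add a more distant persona; force the reversal lens",
    "cognitive_load": "Use --no-rebuttal or --quick for lower-stakes problems",
}


def interpret_trends(sessions: list[dict[str, int]]) -> list[str]:
    """Map session scores (1-5) to the actions listed in the trend table."""
    if len(sessions) < 3:
        return []  # too few sessions to call anything a trend
    findings = [
        action
        for metric, action in ACTIONS.items()
        if all(s[metric] <= 2 for s in sessions)
    ]
    if all(s[metric] >= 4 for s in sessions for metric in ACTIONS):
        findings.append("No change")
    return findings
```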
|
|
777
449
|
|
|
778
450
|
---
|
|
779
451
|
|
|
780
452
|
## A/B Experiment Protocol | A/B 實驗協議
|
|
781
453
|
|
|
782
|
-
|
|
783
|
-
your specific problem types, rather than relying on the research assumptions
|
|
784
|
-
alone.
|
|
454
|
+
Validate whether v3 outperforms v2 for *your* problem types rather than relying on the research alone.
|
|
785
455
|
|
|
786
|
-
|
|
456
|
+
驗證 v3 是否真的對你的問題類型優於 v2,而非單純依賴研究。
|
|
787
457
|
|
|
788
|
-
|
|
458
|
+
**Duration:** 3 paired sessions (minimum). **Method:** same category of problem, alternating method.
|
|
789
459
|
|
|
790
|
-
**Duration:** 3 paired sessions (minimum)
|
|
791
|
-
**Method:** Same category of problem, alternating method
|
|
792
|
-
|
|
793
|
-
**Session pairing:**
|
|
794
460
|
```
|
|
795
|
-
Session A1:
|
|
796
|
-
Session B1:
|
|
797
|
-
[one week gap]
|
|
798
|
-
Session A2: Problem of type Y → v2.0
|
|
799
|
-
Session B2: Problem of type Y → v1.0
|
|
800
|
-
[one week gap]
|
|
801
|
-
Session A3: Problem of type X → v2.0
|
|
802
|
-
Session B3: Problem of type X → v1.0
|
|
461
|
+
Session A1: type X → v2 (single AI, count gate, single rebuttal)
|
|
462
|
+
Session B1: type X → v3 (persona ensemble, multi-critic)
|
|
463
|
+
[one week gap] ... alternate to reduce order effects ...
|
|
803
464
|
```
|
|
804
465
|
|
|
805
|
-
|
|
806
|
-
|
|
807
|
-
**Critical: evaluate each session immediately after completion.** Do not wait — memory of cognitive load fades fastest.
|
|
808
|
-
|
|
809
|
-
### What to Measure
|
|
810
|
-
|
|
811
|
-
For each session, record:
|
|
812
|
-
|
|
813
|
-
| Measure | v1.0 session | v2.0 session |
|
|
814
|
-
|---------|-------------|-------------|
|
|
815
|
-
| Adoption Rate (1–5) | | |
|
|
816
|
-
| Diversity (1–5) | | |
|
|
817
|
-
| Cognitive Load (1–5) | | |
|
|
818
|
-
| Time to complete (min) | | |
|
|
819
|
-
| Ideas generated (count) | | |
|
|
820
|
-
| Ideas in Batch 2 that surprised you | N/A | |
|
|
821
|
-
|
|
822
|
-
### Interpreting Results
|
|
823
|
-
|
|
824
|
-
- **v2.0 wins** if Adoption Rate and Diversity are both higher, and Cognitive
|
|
825
|
-
Load difference is ≤ 1 point
|
|
826
|
-
- **v1.0 wins** if v2.0 Cognitive Load is ≥ 2 points lower *and* Adoption Rate
|
|
827
|
-
difference is < 1 point
|
|
828
|
-
- **Situational** if results differ by problem type → implement full situation
|
|
829
|
-
routing (see Mode Selection section)
|
|
830
|
-
|
|
831
|
-
### Key Hypothesis to Validate
|
|
466
|
+
**Measure per session:** Adoption (1–5), Diversity (1–5), Cognitive Load (1–5), Time, Ideas generated, "ideas from a persona/lens that surprised you".
|
|
832
467
|
|
|
833
|
-
|
|
834
|
-
|
|
835
|
-
|
|
468
|
+
**Interpretation:**
|
|
469
|
+
- **v3 wins** if Adoption and Diversity are both higher and the Cognitive-Load difference is ≤ 1 point.
|
|
470
|
+
- **v2 wins** if v3's Cognitive Load score is ≥ 2 points lower (that is, the v3 sessions felt markedly more taxing) *and* the Adoption difference is < 1 point.
|
|
471
|
+
- **Situational** if results differ by problem type → keep both and route by Mode Selection.
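The three interpretation rules above can be written as a decision function. This is a sketch under stated assumptions: scores are per-method averages on the 1–5 scale, the dictionary keys are hypothetical, and the `"inconclusive"` outcome is added here for results that match neither winning rule, a case the guide does not name.

```python
Scores = dict[str, float]  # keys: "adoption", "diversity", "cognitive_load"


def verdict(v2: Scores, v3: Scores) -> str:
    """Apply the v3-wins and v2-wins rules to one problem type."""
    load_gap = v3["cognitive_load"] - v2["cognitive_load"]
    if (
        v3["adoption"] > v2["adoption"]
        and v3["diversity"] > v2["diversity"]
        and abs(load_gap) <= 1
    ):
        return "v3"
    if load_gap <= -2 and abs(v3["adoption"] - v2["adoption"]) < 1:
        return "v2"
    return "inconclusive"


def overall(by_problem_type: dict[str, tuple[Scores, Scores]]) -> str:
    """Return the shared verdict, or 'situational' when problem types disagree."""
    verdicts = {verdict(v2, v3) for v2, v3 in by_problem_type.values()}
    return verdicts.pop() if len(verdicts) == 1 else "situational"
```

A "situational" result corresponds to keeping both versions and routing between them with the Mode Selection table.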
|
|
836
472
|
|
|
837
|
-
|
|
838
|
-
|
|
839
|
-
2. **
|
|
840
|
-
|
|
841
|
-
3. **Rebuttal hypothesis:** After the rebuttal round, did you actually modify
|
|
842
|
-
or discard any ideas? If never, the rebuttal round may not be adding value
|
|
843
|
-
for your problem types.
|
|
473
|
+
**Specific hypotheses to check:**
|
|
474
|
+
1. **Persona hypothesis:** do different personas actually produce non-overlapping ideas, or do they converge anyway?
|
|
475
|
+
2. **Lens hypothesis:** does the reversal/analogical lens surface ideas no persona reached?
|
|
476
|
+
3. **Multi-critic hypothesis:** do the three critics ever disagree materially, or do their scores collapse together (if always identical, one critic suffices)?
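The multi-critic hypothesis can be checked from logged scores by looking at the spread between critics for each idea. In this sketch the function name and the 1.0-point threshold for "material" disagreement are assumptions; the guide does not set a number.

```python
def material_disagreements(
    scores_by_idea: dict[str, list[float]], threshold: float = 1.0
) -> list[str]:
    """Ideas whose highest and lowest critic scores differ by at least `threshold`."""
    return [
        idea
        for idea, scores in scores_by_idea.items()
        if max(scores) - min(scores) >= threshold
    ]


# Critic scores from the aggregation example: every spread is 0.5 points.
example = {
    "Single-page checkout": [4.0, 4.5, 4.5],
    "One-click buy": [3.0, 3.5, 3.5],
    "Progressive form": [4.5, 4.0, 4.0],
}
print(material_disagreements(example))
```

If this list stays empty across many sessions, that is a signal for the hypothesis that one critic would suffice.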
|
|
844
477
|
|
|
845
478
|
---
|
|
846
479
|
|
|
847
480
|
## Research Validity Caveats | 研究效度說明
|
|
848
481
|
|
|
849
|
-
|
|
850
|
-
external validity risk when applied to AI-assisted single-user brainstorming.
|
|
851
|
-
Understand the limitations before treating the research as settled fact.
|
|
482
|
+
Each finding has a different external-validity risk in AI-assisted, single-user contexts. Understand the limits before treating research as settled.
|
|
852
483
|
|
|
853
|
-
|
|
484
|
+
### v3 core mechanisms — risk: LOW–MEDIUM
|
|
854
485
|
|
|
855
|
-
|
|
486
|
+
**Persona ensemble + CoT (finding #1)** and **associative/cross-domain prompting (#5)** are about *prompting an LLM*, tested directly in LLM ideation studies — so they transfer well to this skill's exact use case. The main residual risk is that *simulated* personas in a single context may converge more than *isolated* agents; this is precisely why baseline uses branch isolation and the Enhanced tier exists. Validate with the "persona hypothesis" above.
|
|
856
487
|
|
|
857
|
-
**
|
|
858
|
-
separately then merging outperform interacting groups. Mechanism: production
|
|
859
|
-
blocking and conformity pressure in groups.
|
|
488
|
+
**Multi-critic panel (#6)** rests on LLMs being weak evaluators — well-supported — but the gain depends on the critics being genuinely independent. If your three critics always agree, you have one critic in three costumes; check the "multi-critic hypothesis".
|
|
860
489
|
|
|
861
|
-
|
|
862
|
-
same way a dominant human voice does in a group.
|
|
490
|
+
### Carried-over caveats — re-rated for v3
|
|
863
491
|
|
|
864
|
-
|
|
492
|
+
#### Pre-flight anti-anchoring — risk: LOW (was MEDIUM in v2)
|
|
865
493
|
|
|
866
|
-
The
|
|
867
|
-
conformity pressure and real-time interruption. AI output is static text you
|
|
868
|
-
can choose to ignore. The question is whether *reading* AI output before writing
|
|
869
|
-
your own ideas meaningfully narrows your semantic exploration.
|
|
494
|
+
v2 rated this MEDIUM, reasoning that static AI text is easier to ignore than a dominant human voice. The CHI 2024 fixation study (#3) now provides **direct** evidence that AI output deepens fixation in ideation. The mechanism transfers; pre-flight is kept and strengthened.
|
|
870
495
|
|
|
871
|
-
|
|
872
|
-
the AI's first batch was genuinely different from your Pre-flight ideas. If yes,
|
|
873
|
-
Pre-flight is working. If the AI frequently produces ideas similar to yours
|
|
874
|
-
anyway, Pre-flight's value may be in forcing you to articulate your starting
|
|
875
|
-
point — a different but still valid benefit.
|
|
496
|
+
#### Nijstad "best ideas in the second half" — risk: HIGH → mechanism demoted
|
|
876
497
|
|
|
877
|
-
|
|
498
|
+
This is a human-group finding (exhausting obvious associations first). LLMs more often **plateau and recycle** than improve late. v3 therefore demotes the fixed 10-idea count gate to an auxiliary nudge and makes structural diversity (personas + lenses) the real engine. Do not treat a high idea count as evidence of diversity.
|
|
878
499
|
|
|
879
|
-
|
|
880
|
-
sessions are the most conventional. Creative ideas emerge after the obvious
|
|
881
|
-
zone is exhausted.
|
|
500
|
+
#### Nemeth "debate beats no-criticism" — risk: HIGH → role hardened, not relied upon
|
|
882
501
|
|
|
883
|
-
|
|
884
|
-
this temporal pattern holds in AI-assisted contexts.
|
|
885
|
-
|
|
886
|
-
**External validity risk: LOW**
|
|
887
|
-
|
|
888
|
-
This finding is about cognitive pattern (exhausting obvious associations first),
|
|
889
|
-
not about group dynamics. It is more likely to transfer to AI-assisted solo
|
|
890
|
-
contexts because the underlying mechanism (semantic network traversal) applies
|
|
891
|
-
regardless of whether a human or AI is generating ideas.
|
|
892
|
-
|
|
893
|
-
**How to validate:** After each session, review ideas 1–5 vs ideas 7–10. Are
|
|
894
|
-
the later ideas consistently less obvious? If yes, the gate is earning its keep.
|
|
895
|
-
|
|
896
|
-
### Assumption 3: Debate produces better ideas (Nemeth basis)
|
|
897
|
-
|
|
898
|
-
**Original finding:** Nemeth (1995) shows that groups instructed to debate
|
|
899
|
-
produce more and better ideas than groups following "no criticism" rules.
|
|
900
|
-
|
|
901
|
-
**Application to v2.0:** The Rebuttal Round assumes that structured debate with
|
|
902
|
-
AI counterarguments improves idea quality.
|
|
903
|
-
|
|
904
|
-
**External validity risk: HIGH**
|
|
905
|
-
|
|
906
|
-
This is the most tenuous transfer. Nemeth's finding is about *genuine
|
|
907
|
-
disagreement between people with different perspectives and stakes*. An AI
|
|
908
|
-
playing "devil's advocate" on demand may produce counterarguments that are
|
|
909
|
-
formally correct but lack the authentic dissent that drives Nemeth's effect.
|
|
910
|
-
The counterarguments may feel adversarial without being genuinely revelatory.
|
|
911
|
-
|
|
912
|
-
**How to validate:** After each rebuttal round, honestly assess: did the
|
|
913
|
-
counterarguments surface something you had genuinely not considered? If the
|
|
914
|
-
answer is consistently "no, I had already thought of this", the rebuttal round
|
|
915
|
-
may be providing false confidence rather than genuine challenge. Consider
|
|
916
|
-
switching to `--no-rebuttal` for well-understood problem domains.
|
|
502
|
+
Nemeth's effect is about *genuine disagreement between people with stakes*. An on-demand AI devil's advocate can be formally correct yet not genuinely revelatory, and soft critique framings drift into sycophancy. v3 responds by **hardening the role** (explicit "argue this will fail" + Steelman) rather than assuming the effect transfers. Assess honestly after each round: did the counterargument surface something you had not considered? If the answer is consistently "no", use `--no-rebuttal` for well-understood domains.
 
 ---
 
 ## Gradual Adoption Protocol | 漸進採用協議
 
-If
-rather than all at once. Spend two weeks on each phase before adding the next.
+If full v3 feels heavy, adopt in sequence — two weeks each.
 
-
+1. **Persona ensemble only (weeks 1–2):** run `--no-rebuttal` with the default personas. Focus: do different personas produce non-overlapping ideas?
+2. **Add diversity lenses (weeks 3–4):** force `--lens reversal` then `--lens analogical`. Focus: do lenses reach ideas personas alone did not?
+3. **Add multi-critic + hard-role rebuttal (weeks 5–6):** run full v3. Focus: do critics disagree, and do counterarguments change your selection?
 
-
+After 6 weeks, review session logs and A/B data to calibrate which combination fits your problem types.
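The three-phase schedule above can be written down as data so a team can look up which phase a given week falls in. This is a rough sketch and not something the package ships: `Phase`, `PHASES`, and `phase_for_week` are hypothetical names. Only the flags the guide names for each phase are listed; whether `--no-rebuttal` stays on during weeks 3–4 is not stated in the source and is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    """One adoption phase (hypothetical structure)."""
    name: str
    first_week: int
    last_week: int
    flags: tuple   # flags named by the guide for this phase
    focus: str     # the question to answer during the phase


PHASES = (
    Phase("Persona ensemble only", 1, 2, ("--no-rebuttal",),
          "Do different personas produce non-overlapping ideas?"),
    Phase("Add diversity lenses", 3, 4, ("--lens reversal", "--lens analogical"),
          "Do lenses reach ideas personas alone did not?"),
    Phase("Add multi-critic + hard-role rebuttal", 5, 6, (),
          "Do critics disagree, and do counterarguments change your selection?"),
)


def phase_for_week(week):
    """Return the Phase covering `week`, or None once the six weeks are over."""
    for phase in PHASES:
        if phase.first_week <= week <= phase.last_week:
            return phase
    return None


if __name__ == "__main__":
    for week in range(1, 8):
        phase = phase_for_week(week)
        print(week, phase.name if phase else "review logs and A/B data")
```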
 
-
-rebuttal. Focus on: does writing 3 ideas first change what the AI produces?
+---
 
-
+## Best Practices
 
-
-7–10 better than ideas 1–5?
+### Do's
 
-
+- Complete Pre-flight before starting; never seed with a product analogy.
+- Run the full persona ensemble and at least one diversity lens — diversity comes from structure, not volume.
+- Keep persona generation branch-isolated until every persona is done.
+- Take the hard-role rebuttal seriously — vague defences are a warning sign.
+- Let engineers estimate Effort/RICE, not the LLM.
+- Record the three evaluation metrics; review trends every 3 sessions.
+- Use `--enhanced` when the host supports it and diversity matters most.
 
-
-had not considered?
+### Don'ts
 
-
-
+- Don't read AI output before writing your Pre-flight ideas.
+- Don't accept a "like X but for Y" seed.
+- Don't treat a high idea count as diversity — check persona/lens spread.
+- Don't accept vague AI counterarguments — insist on specific failure conditions.
+- Don't let a single LLM be the sole judge — aggregate critics; you decide.
+- Don't draw conclusions from a single session — wait for 3+ data points.
 
 ---
 
-##
+## Integration with UDS Workflow
 
-###
+### Mapping to `/requirement`
 
-
-
-
-
-
-
-
+| Brainstorm Report Section | `/requirement` Field |
+|---------------------------|---------------------|
+| Problem Statement | User Story context |
+| Top Recommendation | Feature description |
+| HMW Questions | Acceptance Criteria seeds |
+| Stakeholder Map | Stakeholder section |
+| Discarded Ideas | Out of Scope |
 
-###
+### Mapping to `/sdd`
 
-
-
-
-
-
-
-
+| Brainstorm Report Section | `/sdd` Field |
+|---------------------------|-------------|
+| Problem Statement | Summary / Motivation |
+| Top Recommendation | Proposed Solution |
+| Multi-critic scores | Trade-offs / Alternatives Considered |
+| Rebuttal responses | Risks section |
+| Estimated Scope | Scope section |
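The two mapping tables above can be read as lookup tables. The sketch below copies the table rows into dictionaries and re-keys a brainstorm report by target field. It is an illustration under assumptions, not the package's implementation: `REQUIREMENT_MAP`, `SDD_MAP`, and `route_report` are hypothetical names, and a report is assumed to be a plain dict of section title to text.

```python
# Rows copied from the `/requirement` mapping table.
REQUIREMENT_MAP = {
    "Problem Statement": "User Story context",
    "Top Recommendation": "Feature description",
    "HMW Questions": "Acceptance Criteria seeds",
    "Stakeholder Map": "Stakeholder section",
    "Discarded Ideas": "Out of Scope",
}

# Rows copied from the `/sdd` mapping table.
SDD_MAP = {
    "Problem Statement": "Summary / Motivation",
    "Top Recommendation": "Proposed Solution",
    "Multi-critic scores": "Trade-offs / Alternatives Considered",
    "Rebuttal responses": "Risks section",
    "Estimated Scope": "Scope section",
}


def route_report(report, mapping):
    """Re-key report sections by their target field.

    Returns (routed, unmapped): `routed` maps target field to section text,
    `unmapped` lists report sections that the chosen table does not cover.
    """
    routed = {mapping[s]: text for s, text in report.items() if s in mapping}
    unmapped = sorted(s for s in report if s not in mapping)
    return routed, unmapped


if __name__ == "__main__":
    report = {
        "Problem Statement": "Onboarding takes too long.",
        "Rebuttal responses": "Risk: data migration may stall.",
    }
    print(route_report(report, REQUIREMENT_MAP))
    print(route_report(report, SDD_MAP))
```

Returning the unmapped sections separately makes it visible when a report section, such as the rebuttal responses, has a home in one target document but not the other.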
 
 ---
 
@@ -974,7 +568,7 @@ which combination works best for your problem types.
 |----------|-------------|
 | [Requirement Engineering](../../core/requirement-engineering.md) | Brainstorm output feeds requirement writing |
 | [Spec-Driven Development](../../core/spec-driven-development.md) | Brainstorm output feeds SDD proposals |
-| [
+| [Anti-Hallucination](../../core/anti-hallucination.md) | Critic feasibility claims must be evidence-based, not asserted |
 
 ---
 
@@ -982,8 +576,9 @@ which combination works best for your problem types.
 
 | Version | Date | Changes |
 |---------|------|---------|
-|
-| 2.
+| 3.0.0 | 2026-06-01 | XSPEC-247: DIVERGE re-centred on persona ensemble + diversity lenses (analogical/reversal/morphological); CONVERGE re-centred on multi-critic panel + hard-role rebuttal (Devil's Advocate + Steelman); Diversity-Collapse Guardrail; Enhanced Tier (parallel persona/critic agents, graceful fallback); Research Foundations rebuilt on 6 verified 2024–2026 sources; Validity Caveats re-rated (pre-flight LOW, Nijstad/Nemeth demoted); new flags `--personas`/`--lens`/`--enhanced`; anti-seed guardrail |
+| 2.1.0 | 2026-05-09 | XSPEC-196 Phase 2: Mode Selection objective routing; Self-Evaluation Framework; A/B Experiment Protocol; Research Validity Caveats; Gradual Adoption Protocol |
+| 2.0.0 | 2026-05-09 | XSPEC-196: Phase 0 Pre-flight (anti-anchoring), Rebuttal Round, 10-idea minimum gate + semantic batching |
 | 1.0.0 | 2026-02-12 | Initial release |
 
 ---
|