buildflow-dev 4.0.1 → 4.0.2

@@ -1,49 +1,197 @@
1
1
  ---
2
2
  name: buildflow-think
3
- description: Research and discuss with parallel Researcher agents
3
+ description: Deep research, architecture review, build-vs-buy reasoning, and engineering cognition
4
4
  allowed-tools: Read, Write, WebSearch
5
5
  agents: strategist, researcher, synthesizer
6
6
  ---
7
7
 
8
8
  # /buildflow-think
9
9
 
10
- Deep research and discussion mode. Spawns parallel Researcher agents then synthesizes findings.
10
+ Research, reasoning, and architecture review. Spawns parallel Researchers when web evidence is needed. Synthesizes conflicting information into a concrete recommendation with confidence score.
11
+
12
+ Goes beyond research — includes engineering cognition modes for architecture review, build-vs-buy analysis, technical debt assessment, and complexity budgeting.
11
13
 
12
14
  ## Usage
13
- - `/buildflow-think` — open-ended discussion
14
- - `/buildflow-think <topic>` — research a specific topic
15
+ - `/buildflow-think <topic>` — research a specific topic or technology
15
16
  - `/buildflow-think tech-stack` — compare technology options
16
- - `/buildflow-think risks` — identify project risks
17
-
18
- ## Step 1: Load Context
19
- Read `.buildflow/memory/light.md`, `.buildflow/core/vision.md`.
20
-
21
- ## Step 2: Clarify Research Goal
22
- Ask: "What do you want to explore or decide?"
23
- If already specified in the command, confirm understanding.
24
-
25
- ## Step 3: Parallel Research (if web research needed)
26
- Spawn up to 3 Researcher agents in parallel, each with:
27
- - A specific research question
28
- - Instructions to find 2-3 sources
29
- - Trust score (1-5) for each source
30
- - Key findings in bullet points
31
-
32
- ## Step 4: Synthesize
33
- Synthesizer agent combines findings:
34
- - Points of agreement across sources
35
- - Conflicting information (flag explicitly)
36
- - Recommendation with confidence (1-5)
37
- - Open questions remaining
38
-
39
- ## Step 5: Discussion
40
- Ask confidence check: "How confident are you in this direction? (1-5)"
41
- - 1-2: Explore alternatives
42
- - 3: Identify what would increase confidence
43
- - 4-5: Move forward, suggest next step
44
-
45
- ## Step 6: Save Insights
46
- Write to `.buildflow/research/[topic]-[date].md`
47
- Update `.buildflow/memory/light.md` with key decisions.
48
-
49
- ## Token Budget: ~30K (parallel research)
17
+ - `/buildflow-think risks` — surface technical and product risks
18
+ - `/buildflow-think --arch` — architecture review of current codebase or proposed design
19
+ - `/buildflow-think --build-vs-buy <capability>` — should we build it or use a library/service?
20
+ - `/buildflow-think --debt` — assess current technical debt and prioritize
21
+ - `/buildflow-think --complexity` — is the proposed plan too complex for the team/timeline?
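The usage list above implies a small dispatch from the text after `/buildflow-think` to a mode. A minimal sketch of that mapping follows, assuming a plain string argument; the names `ThinkRequest`, `parse_think_args`, and the mode labels are hypothetical and do not come from the package.

```python
from dataclasses import dataclass
from typing import Optional

# Flags are taken from the usage list; the mode labels on the right are hypothetical.
FLAG_MODES = {
    "--arch": "arch",
    "--build-vs-buy": "build-vs-buy",
    "--debt": "debt",
    "--complexity": "complexity",
}


@dataclass
class ThinkRequest:
    mode: str
    argument: Optional[str] = None


def parse_think_args(raw: str) -> ThinkRequest:
    """Map the text after `/buildflow-think` to a mode and an optional argument."""
    tokens = raw.split()
    if not tokens:
        # Open-ended call: standard research mode asks what to explore.
        return ThinkRequest("research")
    head, rest = tokens[0], " ".join(tokens[1:])
    if head in FLAG_MODES:
        return ThinkRequest(FLAG_MODES[head], rest or None)
    # Anything else (including `tech-stack` and `risks`) is treated as a research topic.
    return ThinkRequest("research", raw.strip())
```

Treating `tech-stack` and `risks` as ordinary topics is a simplification made for this sketch; the command may handle them specially.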
22
+
23
+ ## Context Packet
24
+ - `.buildflow/core/vision.md`
25
+ - `.buildflow/memory/light.md` (app_name, framework, key decisions only)
26
+ - `.buildflow/codebase/MAP.md` (for --arch, --debt, --complexity modes)
27
+ - `.buildflow/specs/TDD.md` (for --arch mode, if exists)
28
+
29
+ ---
30
+
31
+ ## Standard Research Mode (default)
32
+
33
+ ### Step 1: Clarify Research Goal
34
+ If topic is specified, confirm understanding in one sentence.
35
+ If open-ended: "What are you trying to decide or understand?"
36
+
37
+ ### Step 2: Decompose into Research Questions
38
+ Break the topic into 2–3 specific, answerable sub-questions.
39
+ Assign one to each Researcher agent.
40
+
41
+ ### Step 3: Parallel Research
42
+ Spawn up to 3 Researcher agents simultaneously, each:
43
+ - Answering their specific sub-question
44
+ - Finding 2–3 sources
45
+ - Rating each source trust: 1 (blog opinion) → 5 (official docs / peer-reviewed)
46
+ - Summarizing key findings in bullet points
47
+
48
+ ### Step 4: Synthesize
49
+ Synthesizer combines all findings:
50
+ - **Consensus:** what all sources agree on
51
+ - **Conflicts:** where sources disagree — flag explicitly with each position
52
+ - **Gaps:** what the research didn't answer
53
+ - **Recommendation:** concrete, actionable, with confidence (1–5)
54
+ - **Risks:** what could go wrong with the recommendation
55
+
56
+ ### Step 5: Confidence Gate
57
+ If confidence < 3: "Low confidence. Here's what would increase it: [specific gaps to fill]"
58
+ If confidence ≥ 4: suggest next step (spec / plan / build)
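The gate in Step 5 can be sketched as a small function. The step covers confidence below 3 and confidence of 4 or more, and does not say what happens at exactly 3, so the middle branch here is an assumption. The function name `confidence_gate` is hypothetical.

```python
def confidence_gate(confidence: int, gaps: list[str]) -> str:
    """Return the follow-up message for a 1-5 confidence score."""
    if not 1 <= confidence <= 5:
        raise ValueError("confidence must be between 1 and 5")
    if confidence < 3:
        return "Low confidence. Here's what would increase it: " + "; ".join(gaps)
    if confidence >= 4:
        return "Suggest next step (spec / plan / build)"
    # confidence == 3 is not covered by the step; this wording is an assumption.
    return "Moderate confidence. Consider filling these gaps first: " + "; ".join(gaps)
```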
59
+
60
+ ### Step 6: Save
61
+ Write `.buildflow/research/[topic]-[date].md`
62
+ Update `light.md` key decisions if a choice was made.
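The `[topic]-[date]` placeholders in Step 6 leave the exact file name open. One possible convention, assuming a lowercase hyphenated slug and an ISO 8601 date (neither is stated in the source), is sketched below; `research_note_path` is a hypothetical helper.

```python
import re
from datetime import date
from pathlib import PurePosixPath


def research_note_path(topic: str, on: date) -> PurePosixPath:
    """Build `.buildflow/research/<slug>-<YYYY-MM-DD>.md` for a topic."""
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    return PurePosixPath(".buildflow/research") / f"{slug}-{on.isoformat()}.md"
```

For example, `research_note_path("Postgres vs. SQLite", date(2024, 5, 1))` yields `.buildflow/research/postgres-vs-sqlite-2024-05-01.md`.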
63
+
64
+ ---
65
+
66
+ ## Architecture Review Mode (`--arch`)
67
+
68
+ Triggered when: designing a new system, evaluating a proposed approach, or onboarding to a codebase.
69
+
70
+ ### Step 1: Load Architecture Context
71
+ Read `MAP.md`, `TDD.md` (if exists), `PATTERNS.md`.
72
+ If greenfield: work from vision + proposed TDD.
73
+
74
+ ### Step 2: Structural Analysis
75
+ Evaluate:
76
+ - **Separation of concerns** — do modules have single, clear responsibilities?
77
+ - **Coupling** — are modules tightly bound in ways that make changes expensive?
78
+ - **Cohesion** — does each module contain related things?
79
+ - **Boundaries** — are module boundaries enforced or leaky?
80
+ - **Scalability** — will this design hold under 10× the current load/data?
81
+
82
+ ### Step 3: Pattern Consistency
83
+ Does the proposed design follow existing patterns in the codebase?
84
+ If introducing new patterns: is there a good reason, or is it accidental inconsistency?
85
+
86
+ ### Step 4: Failure Mode Analysis
87
+ For each major component, ask: "What happens when this fails?"
88
+ - Does the failure cascade?
89
+ - Is there a recovery path?
90
+ - Will the user see a clear error or silent corruption?
91
+
92
+ ### Step 5: Engineering Smell Detection
93
+ Flag any of these if present:
94
+ - **God object** — one class/module doing too many things
95
+ - **Shotgun surgery** — a single logical change requires edits across many files
96
+ - **Primitive obsession** — using raw strings/numbers where domain types would be clearer
97
+ - **Anemic model** — data objects with no behavior, all logic in services
98
+ - **Circular dependency** — A imports B imports A
99
+ - **Distributed monolith** — microservices that can't deploy independently
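Of the smells listed above, a circular dependency is the one that can be checked mechanically. The sketch below runs a depth-first search over an import graph and returns the first cycle it finds. The graph shape (module name mapped to the modules it imports) and the function name `find_import_cycle` are assumptions for illustration; how the command actually detects cycles is not described in the source.

```python
from typing import Optional


def find_import_cycle(imports: dict[str, list[str]]) -> Optional[list[str]]:
    """Return one cycle as a list of module names (first == last), or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored
    state = {module: WHITE for module in imports}
    path: list[str] = []

    def visit(module: str) -> Optional[list[str]]:
        state[module] = GREY
        path.append(module)
        for dep in imports.get(module, []):
            dep_state = state.get(dep, WHITE)
            if dep_state == GREY:
                # dep is already on the current path, so the path loops back to it.
                return path[path.index(dep):] + [dep]
            if dep_state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[module] = BLACK
        return None

    for module in imports:
        if state[module] == WHITE:
            found = visit(module)
            if found:
                return found
    return None
```

With `{"a": ["b"], "b": ["a"]}` the function returns `["a", "b", "a"]`, which matches the "A imports B imports A" case.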
100
+
101
+ ### Step 6: Architecture Report
102
+ ```
103
+ Architecture Review
104
+ ───────────────────
105
+ Strengths: [what's well-designed]
106
+ Concerns: [issues with severity: HIGH / MEDIUM / LOW]
107
+ Smells detected: [list or NONE]
108
+ Failure modes: [unhandled scenarios]
109
+ Recommendation: [concrete changes or "proceed as-is"]
110
+ Confidence: [1–5]
111
+ ```
112
+
113
+ ---
114
+
115
+ ## Build vs Buy Mode (`--build-vs-buy <capability>`)
116
+
117
+ Triggered when evaluating whether to implement a capability in-house or use an external library/service.
118
+
119
+ ### Step 1: Define the Capability
120
+ Exact scope: what does this need to do? What are the boundaries?
121
+
122
+ ### Step 2: Research Options
123
+ Parallel Researchers investigate:
124
+ - **Build** — what would implementation cost? What's the maintenance burden?
125
+ - **Buy (OSS)** — what libraries exist? License, maintenance status, community health?
126
+ - **Buy (SaaS)** — what services exist? Cost, reliability, vendor lock-in risk?
127
+
128
+ ### Step 3: Evaluation Matrix
129
+ | Factor | Build | OSS Library | SaaS |
130
+ |--------|-------|-------------|------|
131
+ | Time to working | [est] | [est] | [est] |
132
+ | Ongoing maintenance | high | low–med | none |
133
+ | Customization | full | partial | limited |
134
+ | Cost | dev time | free (usually) | $/mo |
135
+ | Vendor lock-in | none | low | HIGH |
136
+ | Compliance fit | full control | depends | verify |
137
+ | Team expertise needed | yes | some | low |
138
+
139
+ ### Step 4: Recommendation
140
+ Given project constraints (team size, timeline, compliance from PRD):
141
+ - **Recommend:** [build / OSS / SaaS]
142
+ - **Reason:** [top 2 factors that drove the decision]
143
+ - **Risk:** [biggest downside of this choice]
144
+ - **Confidence:** [1–5]
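The matrix and recommendation steps do not prescribe a scoring method. One way to make the comparison explicit is a weighted sum per option, reporting the two factors that contributed most to the winner. Everything in this sketch (the 1 to 5 scores where 5 is most favourable, the weights, the factor names, and the `recommend` function) is hypothetical.

```python
def recommend(scores: dict[str, dict[str, int]],
              weights: dict[str, int]) -> tuple[str, list[str]]:
    """Return (winning option, top two factors by weighted contribution)."""
    totals = {
        option: sum(weights[factor] * value for factor, value in factors.items())
        for option, factors in scores.items()
    }
    winner = max(totals, key=totals.get)
    drivers = sorted(
        scores[winner],
        key=lambda factor: weights[factor] * scores[winner][factor],
        reverse=True,
    )[:2]
    return winner, drivers


# Illustrative numbers only; 5 means most favourable for that factor.
WEIGHTS = {"time_to_working": 3, "maintenance": 2, "customization": 1, "lock_in": 2}
SCORES = {
    "build": {"time_to_working": 1, "maintenance": 1, "customization": 5, "lock_in": 5},
    "oss":   {"time_to_working": 4, "maintenance": 3, "customization": 3, "lock_in": 4},
    "saas":  {"time_to_working": 5, "maintenance": 4, "customization": 2, "lock_in": 1},
}
```

With these sample numbers the totals are 20 (build), 29 (oss), and 27 (saas), so `recommend(SCORES, WEIGHTS)` picks `oss` with `time_to_working` and `lock_in` as the driving factors. A weighted sum hides compliance blockers, so a hard requirement from the PRD would need to be checked separately.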
145
+
146
+ ---
147
+
148
+ ## Technical Debt Mode (`--debt`)
149
+
150
+ ### Step 1: Load Hotspots
151
+ Read `HOTSPOTS.md`. These are the known high-risk files.
152
+
153
+ ### Step 2: Debt Classification
154
+ For each hotspot and any other known issues:
155
+
156
+ | Item | Type | Impact | Cost to Fix | ROI |
157
+ |------|------|--------|------------|-----|
158
+ | [issue] | CODE / ARCH / TEST / INFRA | HIGH/MED/LOW | S/M/L/XL | high/med/low |
159
+
160
+ Debt types:
161
+ - **CODE** — complexity, duplication, poor naming
162
+ - **ARCH** — wrong abstraction, bad module boundary, circular dep
163
+ - **TEST** — missing or shallow test coverage on critical paths
164
+ - **INFRA** — outdated deps, missing CI, manual steps that should be automated
165
+
166
+ ### Step 3: Priority Recommendation
167
+ Sort by ROI (impact of fixing ÷ cost to fix). Top 3 items to address next.
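The ROI ordering in Step 3 can be worked through numerically once the labels are mapped to numbers. The mapping below (HIGH/MED/LOW to 3/2/1 and S/M/L/XL to 1/2/3/5) is an assumption, as are the sample debt items and the function name `top_debt_items`; the source only gives the formula impact ÷ cost.

```python
# Assumed numeric scales; the source only names the labels.
IMPACT = {"HIGH": 3, "MED": 2, "LOW": 1}
COST = {"S": 1, "M": 2, "L": 3, "XL": 5}


def top_debt_items(items: list[dict], limit: int = 3) -> list[tuple[str, float]]:
    """Rank debt items by ROI = impact / cost and return the top `limit`."""
    ranked = sorted(
        ((item["name"], IMPACT[item["impact"]] / COST[item["cost"]]) for item in items),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:limit]


# Hypothetical debt register for illustration.
DEBT = [
    {"name": "duplicated auth checks", "type": "CODE", "impact": "HIGH", "cost": "S"},
    {"name": "billing/users circular dep", "type": "ARCH", "impact": "HIGH", "cost": "L"},
    {"name": "no tests on checkout", "type": "TEST", "impact": "HIGH", "cost": "M"},
    {"name": "outdated deps", "type": "INFRA", "impact": "LOW", "cost": "M"},
    {"name": "manual deploy", "type": "INFRA", "impact": "MED", "cost": "XL"},
]
```

Here the ROI values are 3.0, 1.0, 1.5, 0.5, and 0.4, so the top three are the duplicated auth checks, the missing checkout tests, and the circular dependency, in that order.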
168
+
169
+ ---
170
+
171
+ ## Complexity Budget Mode (`--complexity`)
172
+
173
+ Used before a plan is executed to ask: "Is this too much for the team/timeline?"
174
+
175
+ ### Step 1: Load Plan
176
+ Read `phases/[N]/PLAN.md`. Sum effort estimates.
177
+
178
+ ### Step 2: Complexity Assessment
179
+ ```
180
+ Complexity Budget Check
181
+ ───────────────────────
182
+ Total estimated effort: [sum]
183
+ XL tasks: [N] — [list them] ← each XL is a risk
184
+ External dependencies: [N] ← each is a coordination cost
185
+ New patterns introduced: [N] ← each needs learning time
186
+ Files touching hotspots: [N] ← each is higher risk
187
+
188
+ Verdict: FEASIBLE / RISKY / OVER-SCOPED
189
+ ```
190
+
191
+ - **FEASIBLE** — proceed
192
+ - **RISKY** — flag XL tasks for `/buildflow-think` before building, consider splitting
193
+ - **OVER-SCOPED** — recommend cutting scope, suggest which features to defer
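The source lists the inputs to the budget check but not how they combine into a verdict. A minimal sketch, assuming a simple point system in which each XL task counts double and the thresholds are 4 and 8 points, is shown below. `PlanStats`, `complexity_verdict`, the weights, and the thresholds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PlanStats:
    xl_tasks: int
    external_dependencies: int
    new_patterns: int
    hotspot_files: int


def complexity_verdict(stats: PlanStats, risky_at: int = 4, over_scoped_at: int = 8) -> str:
    """Turn the counts from the budget check into one of the three verdicts."""
    # Assumed weighting: an XL task is worth two risk points, everything else one.
    points = (
        2 * stats.xl_tasks
        + stats.external_dependencies
        + stats.new_patterns
        + stats.hotspot_files
    )
    if points >= over_scoped_at:
        return "OVER-SCOPED"
    if points >= risky_at:
        return "RISKY"
    return "FEASIBLE"
```

The thresholds are parameters so a team can tune them to its own size and timeline; the defaults here carry no special meaning.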
194
+
195
+ ---
196
+
197
+ ## Token Budget: ~30K (standard) / ~35K (--arch or --build-vs-buy) / ~20K (--debt or --complexity)