ideabox 1.0.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/AGENTS.md +14 -0
- package/CLAUDE.md +14 -0
- package/LICENSE +21 -0
- package/README.md +413 -0
- package/bin/cli.mjs +267 -0
- package/package.json +39 -0
- package/skills/backlog/SKILL.md +101 -0
- package/skills/ideabox/SKILL.md +110 -0
- package/skills/ideabox/phases/01-research.md +173 -0
- package/skills/ideabox/phases/02-brainstorm.md +213 -0
- package/skills/ideabox/phases/03-plan.md +166 -0
- package/skills/ideabox/phases/04-build.md +213 -0
- package/skills/ideabox/phases/05-qa.md +135 -0
- package/skills/ideabox/phases/06-polish.md +111 -0
- package/skills/ideabox/phases/07-ship.md +119 -0
- package/skills/ideabox/phases/08-post-ship.md +83 -0
- package/skills/ideabox/phases/09-learn.md +208 -0
- package/skills/ideabox/references/research-sources.md +247 -0
- package/skills/ideabox/references/revenue-models.md +81 -0
- package/skills/ideabox/references/scoring-rubric.md +245 -0
- package/skills/ideabox/references/self-improvement.md +217 -0
- package/skills/profile/SKILL.md +97 -0
- package/skills/research/SKILL.md +62 -0
@@ -0,0 +1,245 @@

# Idea Scoring Rubric

## Hard Filter (must pass at least one)

### Monetization Gate
The idea has a clear path to revenue:
- SaaS subscription (recurring)
- One-time purchase / lifetime deal
- Freemium (free tier + paid features)
- Marketplace fees / commissions
- Sponsorware (open source + paid sponsors)
- Paid plugins / extensions
- API access / usage-based pricing
- Consulting / services built on the tool

### Open-Source Impact Gate
The idea would gain significant community traction:
- Solves a pain point many developers face daily
- Fills a gap in a popular ecosystem (npm, crates.io, MCP, etc.)
- High potential for GitHub stars (comparable projects have 1K+ stars)
- Evidence of community demand (forum posts, issues, comments asking for it)

**If an idea passes neither gate, discard it immediately.**

---

## Scoring Dimensions (each 1-10, max 60)

### 1. Revenue Potential

| Score | Criteria |
|-------|----------|
| 1-3 | No clear revenue model, or market too small |
| 4-5 | Possible freemium model, small market, low willingness to pay |
| 6-7 | Clear revenue model, proven willingness to pay in comparable products, $1K-5K MRR potential |
| 8-9 | Strong revenue model, large market, pricing precedent, $5K-20K MRR potential |
| 10 | Exceptional: large underserved market, high willingness to pay, $20K+ MRR potential |

### 2. Market Gap

| Score | Criteria |
|-------|----------|
| 1-3 | Many good solutions already exist |
| 4-5 | Solutions exist but are mediocre, expensive, or poorly maintained |
| 6-7 | 1-2 competitors but significant room for improvement |
| 8-9 | No direct competitor, or existing solutions miss a key use case |
| 10 | Completely greenfield -- nobody has built this |

### 3. Demand Signal

| Score | Criteria |
|-------|----------|
| 1-3 | No evidence of demand (just seems like a good idea) |
| 4-5 | A few forum posts or tweets mentioning the problem |
| 6-7 | Multiple sources show demand: GitHub issues, Reddit threads, HN comments |
| 8-9 | Strong cross-platform demand: trending topics, many upvotes, recurring requests |
| 10 | Overwhelming demand: appears across 4+ sources, hundreds of upvotes/stars |

**Cross-source bonus:** If the same idea appears across 2+ independent sources:
- 2 sources: +2 to demand signal
- 3+ sources: +3 to demand signal (capped at 10)
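Applied mechanically, the bonus looks like this (a sketch; `demandWithBonus` is an illustrative name, not part of the package):

```javascript
// Apply the cross-source bonus to a base demand score, capped at 10.
function demandWithBonus(base, sourceCount) {
  const bonus = sourceCount >= 3 ? 3 : sourceCount >= 2 ? 2 : 0;
  return Math.min(10, base + bonus);
}
```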

### 4. Feasibility

| Score | Criteria |
|-------|----------|
| 1-3 | Requires a large team, significant infrastructure, or domain expertise you lack |
| 4-5 | Buildable but challenging: 2-4 weeks for an MVP, some unknowns |
| 6-7 | Achievable in 1-2 weeks with AI assistance, well-understood problem |
| 8-9 | Weekend MVP: clear scope, standard tech, straightforward implementation |
| 10 | Could build a working prototype in a single session |

### 5. Stack Fit

| Score | Criteria |
|-------|----------|
| 1-3 | Requires languages/frameworks you've never used |
| 4-5 | Requires learning a new stack, but one related to what you know |
| 6-7 | Mostly in your wheelhouse, one new tool to learn |
| 8-9 | Perfect match for your existing skills |
| 10 | You've built something very similar before |

**Note:** Stack fit is a bonus, never a blocker. A score of 3 doesn't disqualify an idea.

### 6. Trend Momentum

| Score | Criteria |
|-------|----------|
| 1-3 | Stable or declining space, no growth signals |
| 4-5 | Modest growth, some new entrants |
| 6-7 | Growing space with active investment and developer interest |
| 8-9 | Hot space: rapid growth, VC funding, many new tools launching |
| 10 | Explosive: the defining category of the year, everyone talking about it |

**Agentic AI bonus:** Ideas in the agentic AI / MCP / AI tooling space get +2 to trend momentum (capped at 10).

---

## Total Score

Sum of all 6 dimensions. Maximum: 60.

| Range | Rating | Action |
|-------|--------|--------|
| 45-60 | Exceptional | Build this now |
| 35-44 | Strong | Worth serious consideration |
| 25-34 | Decent | Consider if it matches your interests |
| < 25 | Weak | Skip unless you have a personal reason |
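The range-to-rating mapping reduces to a simple threshold function. A minimal sketch (`rating` is an illustrative name, not part of the package):

```javascript
// Map a total score (max 60) to its rating label per the table above.
function rating(total) {
  if (total >= 45) return "Exceptional";
  if (total >= 35) return "Strong";
  if (total >= 25) return "Decent";
  return "Weak";
}
```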

---

## Output Format

### Step 1: Quick-Scan Summary Table

Present ALL ideas in a scannable table first, before detailed cards:

```
## Research Results: {N} ideas scored

| # | Idea | Score | Rating | Monetization | Complexity | Top Signal |
|---|------|-------|--------|--------------|------------|------------|
| 1 | MCP Testing Framework | 48/60 | Exceptional | Freemium $29/mo | 1-week | 47 upvotes on Reddit |
| 2 | AI Cost Monitor | 42/60 | Strong | SaaS $19/mo | Weekend | 3 HN threads |
| 3 | Context Engine CLI | 38/60 | Strong | Open source | 1-week | 12 GitHub issues |
| 4 | Agent Debugger | 35/60 | Strong | Freemium | Multi-week | YC batch trend |
| 5 | MCP Config Manager | 31/60 | Decent | One-time $29 | Weekend | MCP roadmap |
```

### Step 2: Visual Score Breakdown Per Idea

For each idea, present a detailed card with visual score bars:

```
---

### #1: MCP Testing Framework

> Developers cannot test MCP servers before production deployment. No testing tools exist in the ecosystem.

**Score: 48/60 -- Exceptional**

Revenue     [=========-] 9/10  Strong willingness to pay (Postman comparison)
Gap         [========--] 8/10  No direct competitor exists
Demand      [=========-] 9/10  47 Reddit upvotes + 3 HN threads + 12 GitHub issues
Feasibility [========--] 8/10  Weekend MVP achievable
Stack Fit   [======----] 6/10  TypeScript (your stack), new testing patterns
Trend       [========--] 8/10  MCP ecosystem is exploding (+2 agentic bonus)

**Evidence chain:**
[Reddit] "How do you test MCP servers?" (47 upvotes, r/LocalLLaMA)
[HN] "Show HN: MCP server testing" (89 points)
[GitHub] 12 open issues on modelcontextprotocol/servers requesting testing tools

**Revenue model:** Freemium -- free CLI tool, $29/mo cloud dashboard with CI integration
Comparables: Postman ($12/mo), Insomnia (free), Hoppscotch (free)

**Tech stack:**
Best: TypeScript + Node.js test runner
Yours: TypeScript + Node.js (perfect match)

**Complexity:** 1-week MVP (core test runner + basic assertions)

---
```

### Score Bar Format

Use bracketed bars to visualize scores at a glance:

| Score | Visual |
|-------|--------|
| 10/10 | `[==========]` |
| 9/10 | `[=========-]` |
| 8/10 | `[========--]` |
| 7/10 | `[=======---]` |
| 6/10 | `[======----]` |
| 5/10 | `[=====-----]` |
| 4/10 | `[====------]` |
| 3/10 | `[===-------]` |
| 2/10 | `[==--------]` |
| 1/10 | `[=---------]` |
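Bars of this form can be generated mechanically rather than typed by hand. A minimal sketch (`scoreBar` is an illustrative name, not part of the package):

```javascript
// Render a 1-10 score as a bracketed bar, e.g. 7 -> "[=======---]".
// Scores are clamped so the bar is always exactly 10 cells wide.
function scoreBar(score) {
  const filled = Math.max(0, Math.min(10, Math.round(score)));
  return "[" + "=".repeat(filled) + "-".repeat(10 - filled) + "]";
}
```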

### Step 3: Action Prompt

After all cards:

```
Pick an idea: type a number (1-5) to brainstorm and build
save N      -- save to backlog for later
dismiss N   -- not interested (won't suggest similar)
compare N M -- side-by-side comparison
more        -- show more ideas
done        -- finish browsing
```

### Rating Labels

Use these labels consistently:
- 45-60: **Exceptional** -- strong evidence, clear path
- 35-44: **Strong** -- worth serious consideration
- 25-34: **Decent** -- worth it if it matches your interests
- < 25: not shown (filtered out)

### Evidence Chain Rules

- Lead with the strongest evidence (highest upvotes/stars)
- Always include the source platform in brackets: `[Reddit]`, `[HN]`, `[GitHub]`, `[npm]`
- Include the specific metric: upvotes, points, stars, issue count
- Link to the actual URL when available
- Cross-source evidence gets a callout: "Appeared across 3 sources"

### Comparison Format

When the user types `compare N M`:

```
## Head-to-Head: #1 vs #3

|                | #1: MCP Testing    | #3: Context Engine |
|----------------|--------------------|--------------------|
| Score          | **48/60**          | 38/60              |
| Rating         | Exceptional        | Strong             |
| Revenue        | [=========-] 9     | [======----] 6     |
| Gap            | [========--] 8     | [=======---] 7     |
| Demand         | [=========-] 9     | [======----] 6     |
| Feasibility    | [========--] 8     | [========--] 8     |
| Stack Fit      | [======----] 6     | [=========-] 9     |
| Trend          | [========--] 8     | [==--------] 2     |
| Monetization   | Freemium $29/mo    | Open source        |
| Complexity     | 1-week             | 1-week             |
| Top Evidence   | 47 Reddit upvotes  | 12 GitHub issues   |

**Recommendation:** #1 (MCP Testing) -- higher demand evidence, clearer monetization,
stronger trend momentum. #3 has better stack fit but weaker market signals.

Build which one? (1 / 3 / neither)
```

## Ranking

Present ideas sorted by total score (highest first). If tied, prefer:
1. Higher demand signal (proven market)
2. Higher feasibility (faster to ship)
3. Higher revenue potential (monetizable)
@@ -0,0 +1,217 @@

# Self-Improvement Engine

IdeaBox automatically improves its own research, scoring, and query strategies based on outcomes. This file defines the three self-improvement loops.

---

## 1. Source Quality Tracking

Track which research sources contribute to ideas that actually get built.

### Data: `~/.ideabox/source-quality.jsonl`

Append after each session:
```json
{"ts":"{ISO}","session_id":"...","source":"agentic_ai","signals_found":5,"contributed_to_chosen":true,"idea_outcome":"completed"}
{"ts":"{ISO}","session_id":"...","source":"pain_points","signals_found":8,"contributed_to_chosen":false,"idea_outcome":null}
```

### Source Score Formula

For each source, compute a quality score:

```
contribution_rate = ideas_contributed_to_chosen / total_sessions_with_source
outcome_weight    = completed: 1.0, started: 0.6, planned: 0.3, dismissed: 0.0
weighted_score    = sum(contribution_rate * outcome_weight) / sessions_count

source_score = 0.3 * contribution_rate + 0.4 * weighted_score + 0.3 * avg_signals_per_session
Clamp source_score to [0.1, 1.0]
```
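The combination step can be sketched as follows (a minimal sketch; `sourceScore` is an illustrative name, and it assumes all three inputs, `avg_signals_per_session` in particular, have already been normalized to [0, 1] so the terms are comparable):

```javascript
// Combine the three components into a clamped source quality score.
function sourceScore(contributionRate, weightedScore, avgSignals) {
  const raw = 0.3 * contributionRate + 0.4 * weightedScore + 0.3 * avgSignals;
  return Math.min(1.0, Math.max(0.1, raw)); // clamp to [0.1, 1.0]
}
```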

### How Sources Adapt

During the research phase:
1. Read `~/.ideabox/source-quality.jsonl`
2. Compute source scores
3. Sources scoring > 0.7: allocate more search queries (expand coverage)
4. Sources scoring 0.4-0.7: keep as-is
5. Sources scoring < 0.4 for 5+ sessions: reduce to 1 query (don't eliminate — it may recover)
6. Present source quality in the Research Coverage stats block:

```
| Source | Quality | Trend | Sessions |
|--------|---------|-------|----------|
| Agentic AI | 0.85 (high) | ↑ improving | 12 |
| Pain Points | 0.62 (medium) | → stable | 12 |
| Trending | 0.38 (low) | ↓ declining | 10 |
```

### Bootstrap (cold start)

If fewer than 5 sessions exist, use equal weights for all sources. Quality tracking activates after 5+ sessions.

---

## 2. Scoring Weight Adaptation

Track which scoring dimensions best predict ideas that actually ship.

### Data: `~/.ideabox/scoring-feedback.jsonl`

Append when an idea reaches a terminal state (completed, abandoned, or dismissed after planning):
```json
{
  "ts": "{ISO}",
  "idea_id": "...",
  "outcome": "completed",
  "original_scores": {"revenue": 7, "gap": 9, "demand": 8, "feasibility": 8, "stack_fit": 9, "trend": 9},
  "total_score": 50,
  "phases_completed": 8
}
```

### Weight Adaptation Formula

After 10+ scored outcomes, compute how well each dimension separates successful from unsuccessful outcomes:

```
For each dimension:
  completed_avg    = average score for ideas with outcome=completed
  abandoned_avg    = average score for ideas with outcome=abandoned/dismissed
  predictive_power = completed_avg - abandoned_avg

Normalize predictive_power across dimensions so the weights sum to 6.0 (preserving the max total of 60).
```

**Example adaptation:**
- If `feasibility` strongly predicts completion (high for completed, low for abandoned): increase its weight
- If `stack_fit` doesn't predict outcomes (similar scores for both): decrease its weight
- Never let any dimension's weight drop below 0.5 or exceed 1.5 (prevents over-fitting)
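The normalize-and-clamp step can be sketched as below (`adaptWeights` is an illustrative helper; a small floor keeps non-predictive dimensions alive, and because the clamp runs last, the clamped weights may sum to slightly more or less than 6.0 — the anti-drift bound deliberately wins):

```javascript
// Turn per-dimension predictive power into scoring weights summing to ~6.0,
// then clamp each weight to [0.5, 1.5].
function adaptWeights(predictivePower) {
  const dims = Object.keys(predictivePower);
  const floored = dims.map((d) => Math.max(predictivePower[d], 0.01));
  const total = floored.reduce((sum, p) => sum + p, 0);
  const weights = {};
  dims.forEach((d, i) => {
    const w = (floored[i] / total) * dims.length; // normalize: mean weight = 1.0
    weights[d] = Math.min(1.5, Math.max(0.5, w)); // anti-drift clamp
  });
  return weights;
}
```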

### How Weights Apply

In the research phase scoring step:
1. Read `~/.ideabox/scoring-feedback.jsonl`
2. If 10+ outcomes exist, compute adapted weights
3. Replace the default equal weights (1.0 each) with the adapted weights
4. Present the adapted weights in the scoring output:

```
## Scoring Weights (adapted from 15 past outcomes)
Revenue: 1.2x | Gap: 0.8x | Demand: 1.3x | Feasibility: 1.1x | Stack: 0.7x | Trend: 0.9x
Note: Demand and Revenue are your strongest predictors of successful projects.
```

### Bootstrap

If fewer than 10 outcomes exist, use the default equal weights (1.0 each).

---

## 3. Research Query Evolution

Track which search queries return useful results and evolve the query set over time.

### Data: `~/.ideabox/query-performance.jsonl`

Append after each research subagent returns:
```json
{
  "ts": "{ISO}",
  "session_id": "...",
  "source": "agentic_ai",
  "query": "MCP server gaps 2026",
  "results_count": 8,
  "useful_results": 3,
  "contributed_to_idea": true
}
```

### Query Lifecycle

Queries move through a lifecycle:

```
NEW → ACTIVE → (PRODUCTIVE | DECLINING) → (KEEP | RETIRE | REPLACE)
```

**Scoring per query:**
```
query_score = (useful_results / total_results) * 0.5 + (contributed_to_idea ? 0.5 : 0)
```

**Lifecycle rules:**
- **NEW** (< 3 uses): always run; still building data
- **ACTIVE** (3+ uses, score > 0.3): keep running
- **PRODUCTIVE** (5+ uses, score > 0.5): expand with variations
- **DECLINING** (5+ uses, score < 0.2 for 3 consecutive sessions): mark for retirement
- **RETIRED** (marked declining for 3+ sessions): stop running, but keep in the archive
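The per-query score and lifecycle rules can be sketched as below (illustrative helpers; this sketch classifies on an average score and omits the consecutive-session tracking needed to distinguish DECLINING from RETIRED):

```javascript
// Per-query usefulness score, as defined above.
function queryScore(usefulResults, totalResults, contributedToIdea) {
  const yieldRate = totalResults > 0 ? usefulResults / totalResults : 0;
  return yieldRate * 0.5 + (contributedToIdea ? 0.5 : 0);
}

// Simplified lifecycle classifier over use count and average score.
function lifecycleState(uses, avgScore) {
  if (uses < 3) return "NEW";
  if (uses >= 5 && avgScore > 0.5) return "PRODUCTIVE";
  if (uses >= 5 && avgScore < 0.2) return "DECLINING";
  return "ACTIVE";
}
```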

### Query Evolution

When a query is PRODUCTIVE, generate 1-2 variations:
- Original: `"MCP server gaps 2026"`
- Variation 1: `"MCP server most requested features"`
- Variation 2: `"MCP server community wishlist"`

When a query is RETIRED, generate a replacement based on what IS working:
- If agentic_ai queries are productive: generate more specific variants
- If a new technology trend emerges from the results: add queries for it

### How Queries Evolve

During the research phase:
1. Read `~/.ideabox/query-performance.jsonl`
2. Compute query scores
3. Build the active query set: all NEW + ACTIVE + PRODUCTIVE queries
4. Drop RETIRED queries
5. For PRODUCTIVE queries, add variations if they aren't already tracked
6. Present query health in the Research Coverage stats:

```
## Query Health
- Active queries: 18 (6 productive, 9 active, 3 new)
- Retired this session: 2 (low yield for 3+ sessions)
- New variations added: 1 ("AI agent testing framework" from productive "MCP server gaps")
```

### Bootstrap

If `query-performance.jsonl` doesn't exist, use the default queries from `references/research-sources.md`. All queries start as NEW.

### Query Storage

The evolved query set is NOT stored in a separate file — it is recomputed from `query-performance.jsonl` on each run. `research-sources.md` remains the default/seed query set; evolved queries are derived from the performance data.

---

## Integration Points

### Phase 01 (Research) — reads all three:
1. Before launching subagents: load source quality scores, adapted scoring weights, and the active query set
2. Allocate more queries to high-quality sources
3. Use adapted weights in the scoring step
4. Use the evolved query set instead of the defaults (if enough data exists)

### Phase 09 (Learn) — writes all three:
1. After a session outcome: append to `source-quality.jsonl` (which sources contributed)
2. After a terminal idea state: append to `scoring-feedback.jsonl` (scores vs. outcome)
3. After the research phase: append to `query-performance.jsonl` (per-query results)

### Cold Start Protection
- Source quality: activates after 5 sessions
- Scoring weights: activates after 10 scored outcomes
- Query evolution: activates after 3 uses per query

All three fall back to defaults when insufficient data exists, so there is no degradation for new users.

---

## Anti-Drift Safeguards

1. **Score clamping**: source scores stay in [0.1, 1.0], dimension weights in [0.5, 1.5]
2. **Never eliminate**: low-scoring sources get fewer queries, not removal
3. **Archive, don't delete**: retired queries are kept in the log for trend analysis
4. **Periodic reset option**: the user can run `/ideas profile` and choose "reset learning" to clear all feedback data and start fresh
5. **Transparency**: every adapted weight and source score is shown to the user — no hidden adjustments
@@ -0,0 +1,97 @@

---
name: profile
description: >
  Manage your IdeaBox profile — interests, tech stacks, goals, and GitHub username.
  Use when asked to "set up ideabox", "edit ideabox profile", "change interests",
  "update my stack", or "ideabox profile".
  This skill is also invoked automatically on the first /ideas run if no profile exists.
---

# IdeaBox Profile Manager

Manage the user's IdeaBox profile stored at `~/.ideabox/profile.json`.

## If profile.json exists

Read `~/.ideabox/profile.json` and display the current profile:

```
## Your IdeaBox Profile

**GitHub:** {github_username}
**Interests:** {interests as comma-separated list}
**Tech Stacks:** {stacks as comma-separated list}
**Goals:** {goals as comma-separated list}
**Avoid:** {avoid_topics or "none"}
**Exploration rate:** {exploration_rate}%
**Total ideas seen:** {total_interactions}
**Last updated:** {updated_at}
```

Then ask: "Want to update anything? (interests / stacks / goals / avoid topics / GitHub username / reset learning)"

Update only the fields the user specifies. Set `updated_at` to the current ISO timestamp.

### Reset Learning

If the user chooses "reset learning":
1. Confirm: "This will reset all self-improvement data (source quality, scoring weights, query performance, preference scores). Your profile settings (interests, stacks, goals) will be kept. Proceed? (yes/no)"
2. If yes:
   - Delete `~/.ideabox/source-quality.jsonl`
   - Delete `~/.ideabox/scoring-feedback.jsonl`
   - Delete `~/.ideabox/query-performance.jsonl`
   - Delete `~/.ideabox/preferences.jsonl`
   - Reset `profile.json` fields: `category_scores: {}`, `complexity_preference: {}`, `monetization_preference: {}`, `exploration_rate: 0.30`, `total_interactions: 0`
   - Confirm: "Learning data reset. IdeaBox will start fresh with default weights and full exploration."

## If profile.json does NOT exist

Run the first-time setup. Ask these questions ONE AT A TIME:

### Question 1
"What's your GitHub username? (I'll scan your repos to understand your tech stack)"

### Question 2
"What are you most interested in building? Pick all that apply:
A) Developer tools (CLIs, plugins, MCP servers, VS Code extensions)
B) SaaS apps (web apps with subscriptions)
C) Mobile apps (React Native, Expo, Flutter)
D) AI/ML tools (agents, wrappers, fine-tuning tools)
E) Open source libraries (npm packages, crates, PyPI)
F) Desktop apps (Tauri, Electron)
G) Other (tell me)"

### Question 3
"What's your primary goal?
A) Make money (monetization-first ideas)
B) Build great open source (community impact)
C) Both — show me ideas that could do either
D) Learn new tech (ideas that push my skills)"

### Question 4 (optional)
"Any topics you want me to avoid? (e.g., crypto, social media, gaming — or 'none')"

Save the profile:

```json
{
  "github_username": "{answer1}",
  "interests": ["{mapped from answer2}"],
  "stacks": [],
  "goals": ["{mapped from answer3}"],
  "avoid_topics": ["{answer4 or empty}"],
  "category_scores": {},
  "complexity_preference": {},
  "monetization_preference": {},
  "exploration_rate": 0.30,
  "total_interactions": 0,
  "created_at": "{ISO timestamp}",
  "updated_at": "{ISO timestamp}"
}
```

The `stacks` array will be populated by the research phase when it scans the user's GitHub profile.

The `category_scores`, `complexity_preference`, `monetization_preference`, `exploration_rate`, and `total_interactions` fields are used by the self-improving recommendation system (Phase 09 — Learn).

After saving, confirm: "Profile saved! Run `/ideas` to get your first batch of project ideas."
@@ -0,0 +1,62 @@

---
name: research
description: >
  Research-only mode — browse project ideas without committing to build.
  Use when asked to "research ideas", "browse ideas", "just show me ideas",
  "ideabox research", "what's trending", or "find project ideas".
---

# IdeaBox Research (Browse Only)

Research and present project ideas without entering the full build pipeline.
This is the "window shopping" mode — look at ideas, save the ones you like, dismiss the rest.

## Process

This skill delegates to the shared research phase for the actual research logic, but runs in browse-only mode (no automatic transition to brainstorm/plan/build).

### Step 1: Setup

1. Read `~/.ideabox/profile.json`. If it doesn't exist, invoke the `ideabox:profile` skill for setup, then continue.
2. Create the `.ideabox/` directory if needed (for state tracking).

### Step 2: Execute Research Phase

Read `${CLAUDE_SKILL_DIR}/phases/01-research.md` and follow Steps 1-6 (research, score, validate, stats, auto-save, present).

**Browse-only override:** when the user picks an idea, handle it through the actions below rather than entering the full pipeline.

### Step 3: User Actions

After presenting ideas, offer:
- **save N** — Save idea #N to the backlog (append to `~/.ideabox/ideas.jsonl` with status "saved")
- **dismiss N** — Dismiss idea #N (append with status "dismissed")
- **more** — Show 3-5 more ideas from the research results
- **build N** — Execute Steps 4 and 5 first (log the session and track preferences), then save with status "planned", initialize `.ideabox/state.json` with `current_phase: "02-brainstorm"`, and read `${CLAUDE_SKILL_DIR}/phases/02-brainstorm.md`
- **compare N M** — Side-by-side comparison of two ideas (same format as the backlog compare)
- **done** — Finish browsing

### Step 4: Log Session

Append to `~/.ideabox/sessions.jsonl`:
```json
{
  "session_id": "sess_YYYYMMDD_HHMMSS",
  "timestamp": "{ISO}",
  "mode": "research",
  "sources_searched": "{actual count of subagents that returned results}",
  "ideas_generated": "{count}",
  "ideas_presented": "{count}",
  "ideas_saved": "{count}",
  "ideas_dismissed": "{count}"
}
```

### Step 5: Track Preferences

Append to `~/.ideabox/preferences.jsonl` for each user action:
```json
{"ts":"{ISO}","event":"suggested","idea_id":"{id}","category":"{cat}","complexity":"{complexity}","monetization":"{model}"}
{"ts":"{ISO}","event":"accepted","idea_id":"{id}"}
{"ts":"{ISO}","event":"dismissed","idea_id":"{id}"}
```