opencode-design-lab 0.0.0 → 0.0.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,15 +1,16 @@
  # OpenCode Design Lab
 
- An OpenCode plugin that generates multiple independent design proposals using different AI models, then systematically evaluates, compares, and ranks those designs in a reproducible and structured way.
+ An OpenCode plugin that registers a primary design agent and model-specific
+ subagents that generate and review designs directly as Markdown files.
 
  ## Overview
 
- OpenCode Design Lab treats design as an experimental artifact, not a chat response. It enforces:
+ Design Lab uses a file-first, multi-model workflow:
 
- - **Isolation**: Each design agent works independently without seeing other designs
- - **Structure**: All outputs follow predefined JSON schemas
- - **Evaluation**: Multiple reviewers score designs across consistent dimensions
- - **Reproducibility**: Given the same inputs and config, results are reproducible
+ - **Dynamic model mapping**: Subagents are created from your config
+ - **Correct model usage**: Each subagent is bound to its configured model
+ - **File-first outputs**: Designs and reviews are written to disk, not chat
+ - **Cross-review**: The same model set reviews all designs in a single report
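
To make "dynamic model mapping" concrete, here is a minimal TypeScript sketch of the idea: one subagent definition per configured design model. The config fields and the `designer_model_*` naming come from this README; the surrounding plugin API that would consume these definitions is an assumption, not shown in this diff.

```typescript
// Illustrative sketch only: derive one subagent per configured design model.
// The `designer_model_*` naming follows the Usage section below; the plugin
// API that would register these definitions is assumed, not shown here.
interface DesignLabConfig {
  design_models: string[];   // e.g. ["claude-sonnet-4", "gpt-4o"]
  review_models?: string[];  // defaults to design_models
  base_output_dir?: string;  // defaults to ".design-lab"
}

function subagentsFromConfig(config: DesignLabConfig) {
  return config.design_models.map((model) => ({
    // "claude-sonnet-4" -> "designer_model_claude_sonnet_4"
    name: `designer_model_${model.replace(/[^a-zA-Z0-9]+/g, "_")}`,
    model, // each subagent is pinned to its configured model
  }));
}
```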
 
  ## Installation
 
@@ -38,7 +39,8 @@ Then add to your OpenCode config (`~/.config/opencode/opencode.json`):
 
  ## Configuration
 
- Create a config file at `~/.config/opencode/design-lab.json` or `.opencode/design-lab.json`:
+ Create a config file at `~/.config/opencode/design-lab.json` or
+ `.opencode/design-lab.json`:
 
  ```json
  {
@@ -57,315 +59,68 @@ Create a config file at `~/.config/opencode/design-lab.json` or `.opencode/desig
  | `design_models` | `string[]` | **Required** | Models to use for design generation (min 2) |
  | `review_models` | `string[]` | `design_models` | Models to use for reviews. Defaults to all design models if not specified |
  | `base_output_dir` | `string` | `.design-lab` | Base directory for design lab outputs |
- | `design_agent_temperature` | `number` | `0.7` | Temperature for design agents (0-2) |
- | `review_agent_temperature` | `number` | `0.1` | Temperature for review agents (0-2) |
- | `topic_generator_model` | `string` | First design model | Model to use for generating topic names |
+ | `design_agent_temperature` | `number` | `0.7` | Reserved for future use |
+ | `review_agent_temperature` | `number` | `0.1` | Reserved for future use |
+ | `topic_generator_model` | `string` | First design model | Reserved for future use |
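
For reference, a `design-lab.json` matching the table above might look like the following; the model identifiers are illustrative placeholders taken from examples elsewhere in this README, not recommendations:

```json
{
  "design_models": ["claude-sonnet-4", "gpt-4o", "gemini-3-pro"],
  "review_models": ["claude-opus-4", "gpt-5-2"],
  "base_output_dir": ".design-lab"
}
```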
 
  ## Usage
 
- ### 1. Generate Designs
+ ### 1. Ask the primary agent to generate designs
 
- ```
- Use the generate_designs tool with requirements:
- "Design a real-time collaborative document editor with conflict resolution,
- supporting rich text editing, multiple cursors, and offline mode."
- ```
-
- This will:
-
- - Create a directory `.design-lab/YYYY-MM-DD-{topic}/`
- - Generate independent designs from each configured model
- - Save designs as JSON in `designs/` directory
- - Validate all designs against the schema
-
- **Output:**
+ Use the `designer` agent. Example prompt:
 
  ```
- Design generation complete.
-
- Lab Directory: .design-lab/2026-01-22-collaborative-editor/
-
- Results: 3 successful, 0 failed
-
- ✅ claude-sonnet-4: Generated successfully
- ✅ gpt-4o: Generated successfully
- ✅ gemini-3-pro: Generated successfully
-
- Next step: Run the review_designs tool to evaluate and compare the designs.
+ Ask all designer_model subagents to design a deepwiki clone. Output each design
+ as a Markdown file with the model name as the filename.
  ```
 
- ### 2. Review Designs
-
- ```
- Use the review_designs tool
- ```
+ The primary agent will:
 
- This will:
+ - Create a run directory under `.design-lab/YYYY-MM-DD-topic/`
+ - Delegate design generation to each `designer_model_*` subagent
+ - Save designs to `designs/*.md`
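
For the example prompt above, a finished generation step might leave the run directory looking roughly like this (date and topic slug are illustrative):

```
.design-lab/2026-01-22-deepwiki-clone/
└── designs/
    ├── claude-sonnet-4.md
    ├── gpt-4o.md
    └── gemini-3-pro.md
```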
 
- - Load all generated designs
- - Send them to each review model
- - Generate markdown reviews comparing all designs
- - Extract structured scores (0-10) across dimensions:
-   - Clarity
-   - Feasibility
-   - Scalability
-   - Maintainability
-   - Completeness
-   - Overall
+ ### 2. Ask for cross-reviews
 
- **Output:**
+ Use the same `designer` agent to trigger reviews:
 
  ```
- Review complete.
-
- Lab Directory: .design-lab/2026-01-22-collaborative-editor/
-
- Results: 2 successful, 0 failed
-
- ✅ claude-opus-4: Review generated
- ✅ gpt-5-2: Review generated
-
- Reviews saved to: .design-lab/2026-01-22-collaborative-editor/reviews/
- Scores saved to: .design-lab/2026-01-22-collaborative-editor/scores/
-
- Next step: Run the aggregate_scores tool to generate final rankings.
+ Now ask the same set of models to review all designs. Each reviewer outputs one
+ Markdown report comparing all designs at once.
  ```
 
- ### 3. Aggregate Scores
-
- ```
- Use the aggregate_scores tool
- ```
-
- This will:
-
- - Parse all score files
- - Calculate average scores per design
- - Compute variance/disagreement metrics
- - Generate final rankings
- - Create `results.md` with comparative analysis
-
- **Output:**
-
- ```
- Aggregation complete.
-
- Rankings saved to: .design-lab/2026-01-22-collaborative-editor/results/ranking.json
- Results summary saved to: .design-lab/2026-01-22-collaborative-editor/results/results.md
-
- Final Rankings
-
- 1. **gpt-4o** - Score: 8.4/10 (variance: 0.32)
- 2. **claude-sonnet-4** - Score: 8.1/10 (variance: 0.28)
- 3. **gemini-3-pro** - Score: 7.8/10 (variance: 0.45)
- ```
+ Review files are saved to `reviews/review-*.md`.
 
  ## Output Structure
 
- Each design lab session creates a timestamped directory:
+ Each run creates a timestamped directory:
 
  ```
- .design-lab/YYYY-MM-DD-{topic}/
- ├── task.json                  # Original requirements and config
- ├── designs/                   # Generated designs
- │   ├── claude-sonnet-4.json
- │   ├── gpt-4o.json
- │   └── gemini-3-pro.json
- ├── reviews/                   # Markdown reviews
- │   ├── review-claude-opus-4.md
- │   └── review-gpt-5-2.md
- ├── scores/                    # Structured scores
- │   ├── claude-sonnet-4-by-claude-opus-4.json
- │   ├── claude-sonnet-4-by-gpt-5-2.json
- │   ├── gpt-4o-by-claude-opus-4.json
- │   └── ...
- └── results/                   # Final aggregation
-     ├── ranking.json           # Numeric rankings
-     └── results.md             # Human-readable summary
- ```
-
- ## Design Artifact Schema
-
- Each design must conform to this structure:
-
- ```typescript
- {
-   title: string;
-   summary: string;
-   assumptions: string[];
-   architecture_overview: string;
-   components: Array<{
-     name: string;
-     description: string;
-     responsibilities: string[];
-   }>;
-   data_flow: string;
-   tradeoffs: Array<{
-     aspect: string;
-     options: string[];
-     chosen: string;
-     rationale: string;
-   }>;
-   risks: Array<{
-     risk: string;
-     impact: "low" | "medium" | "high";
-     mitigation: string;
-   }>;
-   open_questions: string[];
- }
- ```
-
- ## Score Schema
-
- Reviewers produce scores following this structure:
-
- ```typescript
- {
-   design_id: string;
-   reviewer_model: string;
-   scores: {
-     clarity: number;          // 0-10
-     feasibility: number;      // 0-10
-     scalability: number;      // 0-10
-     maintainability: number;  // 0-10
-     completeness: number;     // 0-10
-     overall: number;          // 0-10
-   };
-   justification: string;
-   strengths: string[];
-   weaknesses: string[];
-   missing_considerations: string[];
- }
+ .design-lab/YYYY-MM-DD-topic/
+ ├── designs/
+ │   ├── claude-sonnet-4.md
+ │   ├── gpt-4o.md
+ │   └── gemini-3-pro.md
+ └── reviews/
+     ├── review-claude-opus-4.md
+     └── review-gpt-5-2.md
  ```
 
- ## How It Works
-
- ### Multi-Agent Architecture
-
- Based on patterns from [oh-my-opencode](https://github.com/code-yeongyu/oh-my-opencode), each agent runs in its own OpenCode session:
-
- 1. **Create Session**: `ctx.client.session.create({ ... })`
- 2. **Send Prompt**: `ctx.client.session.prompt({ agent: model, ... })`
- 3. **Poll Completion**: Check `session.status()` until idle
- 4. **Extract Output**: Parse `session.messages()` for JSON
-
- ### Sequential Execution
-
- Design Lab v1 runs agents **sequentially** (one after another) rather than in parallel. This:
-
- - Simplifies implementation
- - Avoids overwhelming the session manager
- - Still provides multiple independent perspectives
-
- ### Schema Validation
-
- All outputs are validated using Zod schemas:
-
- - Design artifacts validated before saving
- - Scores validated during review
- - JSON schemas auto-generated via `z.toJSONSchema()` (Zod v4)
-
  ## Development
 
- ### Build
-
  ```bash
+ # Build the plugin (outputs to .opencode/plugins/design-lab.js)
  bun run build
- ```
-
- Output: `.opencode/plugins/design-lab.js`
 
- ### Generate JSON Schemas
+ # Development with watch mode
+ bun run dev
 
- ```bash
- bun src/utils/schema-export.ts
- ```
-
- Output: `schemas/*.schema.json`
+ # Run tests (vitest)
+ bun run test
 
- ### Project Structure
+ # Format code with prettier
+ bun run format
 
+ # Type checking
+ bun run typecheck
  ```
- src/
- ├── design-lab.ts              # Plugin entry point
- ├── agents/
- │   └── index.ts               # Agent factory functions
- ├── config/
- │   ├── schema.ts              # Zod schemas
- │   ├── loader.ts              # Config loading
- │   └── index.ts
- ├── tools/
- │   ├── generate-designs.ts    # Design generation orchestrator
- │   ├── review-designs.ts      # Review orchestrator
- │   ├── aggregate-scores.ts    # Score aggregation
- │   └── index.ts
- └── utils/
-     ├── session-helpers.ts     # OpenCode session utilities
-     └── schema-export.ts       # Schema generator
- ```
-
- ## Examples
-
- ### Example: API Gateway Design
-
- ```
- Use generate_designs with requirements:
- "Design a high-performance API gateway for microservices.
- Must support:
- - Rate limiting and throttling
- - Authentication and authorization
- - Request/response transformation
- - Service discovery
- - Circuit breaking
- - Monitoring and observability
- Target: 100,000+ requests/second
- Constraints: Cloud-native, Kubernetes deployment"
- ```
-
- ### Example: Deepwiki Clone
-
- ```
- Use generate_designs with requirements:
- "Design a DeepWiki clone - a service that indexes GitHub repositories
- and provides AI-powered search and Q&A over the codebase.
- Must support:
- - Repository indexing and updates
- - Vector search over code
- - Multi-language support
- - Usage tracking and analytics
- - API rate limiting
- Constraints: Open source, self-hostable"
- ```
-
- ## Design Philosophy
-
- > The goal is not simply to pick the "best" design, but to extract the best practices and insights from each model's design, then merge them into a superior composite design. Each model contributes unique strengths that can be combined to create a more robust solution.
-
- - **Multiple Perspectives**: Different models bring different strengths
- - **Structured Comparison**: Objective scoring across consistent dimensions
- - **Reproducible Process**: Same inputs → same structure (within model variance)
- - **Design as Artifact**: Not a conversation, but a versioned document
-
- ## Roadmap (Future)
-
- - [ ] Background execution with progress notifications
- - [ ] Iterative refinement loops
- - [ ] Pairwise ranking (Elo-style)
- - [ ] Human-in-the-loop scoring
- - [ ] Design isolation hook (prevent agents reading other designs)
- - [ ] Visualization dashboard
- - [ ] Design merging/synthesis tool
-
- ## Contributing
-
- Contributions welcome! Please read [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
-
- ## License
-
- MIT
-
- ## References
-
- - [OpenCode](https://github.com/sst/opencode) - The extensible AI coding assistant
- - [oh-my-opencode](https://github.com/code-yeongyu/oh-my-opencode) - Multi-agent patterns
- - [PRD.md](PRD.md) - Full requirements specification
package/package.json CHANGED
@@ -1,7 +1,7 @@
  {
    "name": "opencode-design-lab",
    "type": "module",
-   "version": "0.0.0",
+   "version": "0.0.4",
    "description": "An OpenCode plugin that generates multiple independent design proposals using different AI models, then systematically evaluates, compares, and ranks those designs in a reproducible and structured way.",
    "author": "OpenCode Design Lab Contributors",
    "license": "MIT",
@@ -13,10 +13,7 @@
    "bugs": {
      "url": "https://github.com/HuakunShen/opencode-design-lab/issues"
    },
-   "exports": {
-     ".": "./.opencode/plugins/design-lab.js",
-     "./package.json": "./package.json"
-   },
+   "main": ".opencode/plugins/design-lab.js",
    "files": [
      ".opencode/plugins"
    ],