@adaptic/maestro 1.1.6 → 1.1.8

@@ -0,0 +1,349 @@
1
+ # Media Generation Setup Guide
2
+
3
+ How to generate branded illustrations, diagrams, and video assets using Google Gemini and Veo APIs. Covers API setup, prompt specification authoring, image generation, video generation, brand alignment, and troubleshooting.
4
+
5
+ **Prerequisites**: Complete the [Mac Mini Bootstrap](../runbooks/mac-mini-bootstrap.md). Node.js 20+ must be installed.
6
+
7
+ ---
8
+
9
+ ## Architecture Overview
10
+
11
+ ```
12
+ ┌─────────────────────────────────────────────────────────────────┐
13
+ │ PROMPT SPECS (TypeScript/ESM files) │
14
+ │ │
15
+ │ scripts/media-generation/prompts/ │
16
+ │ ├── illustrations/ *.ts / *.mjs │
17
+ │ ├── diagrams/ *.ts / *.mjs │
18
+ │ └── videos/ *.ts / *.mjs │
19
+ │ │
20
+ │ Each spec exports: { id, slug, title, purpose, style, │
21
+ │ colorPalette, mood, elements, aspectRatio, ... } │
22
+ ├─────────────────────────────────────────────────────────────────┤
23
+ │ GENERATION │
24
+ │ │
25
+ │ generate-assets.mjs │
26
+ │ ├── --illustration <slug> → gemini-image-client.mjs │
27
+ │ │ └── @google/genai SDK → Gemini 3 Pro / Imagen │
28
+ │ │ └── public/generated/illustrations/<slug>.png │
29
+ │ │ │
30
+ │ ├── --video <slug> → veo-video-client.mjs │
31
+ │ │ └── @google/genai SDK → Veo 3.1 │
32
+ │ │ └── public/generated/videos/<slug>.mp4 │
33
+ │ │ │
34
+ │ ├── --all-missing → batch generate all missing illustrations │
35
+ │ └── --all-missing-videos → batch generate all missing videos │
36
+ │ │
37
+ │ Rate limit: 15s between requests │
38
+ │ Generation log: scripts/media-generation/generation-log.json │
39
+ ├─────────────────────────────────────────────────────────────────┤
40
+ │ OUTPUT │
41
+ │ │
42
+ │ public/generated/ │
43
+ │ ├── illustrations/ *.png │
44
+ │ ├── diagrams/ *.png │
45
+ │ └── videos/ *.mp4 │
46
+ └─────────────────────────────────────────────────────────────────┘
47
+ ```
48
+
49
+ ---
50
+
51
+ ## 1. API Setup
52
+
53
+ ### 1.1 Get a Gemini API Key
54
+
55
+ 1. Go to https://aistudio.google.com/apikey
56
+ 2. Create a new API key (or use an existing one)
57
+ 3. The same key works for both image (Gemini/Imagen) and video (Veo) generation
58
+
59
+ ### 1.2 Configure Environment
60
+
61
+ Add to `.env`:
62
+
63
+ ```bash
64
+ GEMINI_API_KEY=your-api-key-here
65
+ ```
66
+
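A minimal sketch of a startup check for the key configured above. The helper names `requireGeminiApiKey` and `maskKey` are hypothetical and not part of the package; the sketch only assumes that the environment has already been loaded (for example by `dotenv`) into a plain object such as `process.env`.

```javascript
// Hypothetical helper: returns the trimmed key, or throws with an actionable message.
function requireGeminiApiKey(env) {
  const key = (env.GEMINI_API_KEY ?? "").trim();
  // Treat the placeholder from the .env example as "not set".
  if (key === "" || key === "your-api-key-here") {
    throw new Error("GEMINI_API_KEY not set: add it to .env");
  }
  return key;
}

// Hypothetical helper: mask the key so the full value never reaches logs.
function maskKey(key) {
  return key.length <= 8
    ? "*".repeat(key.length)
    : `${key.slice(0, 4)}...${key.slice(-4)}`;
}

// Usage with an explicit object; a script would pass process.env instead.
console.log(maskKey(requireGeminiApiKey({ GEMINI_API_KEY: "abcd1234efgh5678" })));
// prints "abcd...5678"
```

Failing early with a clear message is usually cheaper than discovering a missing key after a batch run has started.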
67
+ ### 1.3 Install Dependencies
68
+
69
+ ```bash
70
+ cd ~/agent-repo && npm install
71
+ ```
72
+
73
+ Required packages (in `package.json`):
74
+ - `@google/genai` — Google AI SDK for Gemini and Veo
75
+ - `dotenv` — Environment variable loading
76
+
77
+ ### 1.4 Verify Setup
78
+
79
+ ```bash
80
+ # Quick test — list available prompt specs
81
+ node scripts/media-generation/generate-assets.mjs --list
82
+ ```
83
+
84
+ ---
85
+
86
+ ## 2. Image Generation (Gemini / Imagen)
87
+
88
+ ### 2.1 Generate a Single Illustration
89
+
90
+ ```bash
91
+ node scripts/media-generation/generate-assets.mjs --illustration <slug>
92
+ ```
93
+
94
+ The `<slug>` must match a prompt spec file in `scripts/media-generation/prompts/illustrations/`.
95
+
96
+ ### 2.2 Generate All Missing
97
+
98
+ ```bash
99
+ # Generate illustrations for all specs that don't have output files yet
100
+ node scripts/media-generation/generate-assets.mjs --all-missing
101
+ ```
102
+
103
+ ### 2.3 Models
104
+
105
+ | Model | Quality | Speed | Use Case |
106
+ |---|---|---|---|
107
+ | `gemini-3-pro-image-preview` | Best | Slower | Default — board packs, investor materials |
108
+ | `imagen-*` variants | Good | Fast | High-volume generation, social media |
109
+
110
+ Override via prompt spec or environment:
111
+
112
+ ```bash
113
+ GEMINI_MODEL=imagen-3.0-generate-002 node scripts/media-generation/generate-assets.mjs --illustration <slug>
114
+ ```
115
+
116
+ ### 2.4 Aspect Ratios
117
+
118
+ Supported: `1:1`, `3:4`, `4:3`, `9:16`, `16:9`
119
+
120
+ Set per prompt spec in the `aspectRatio` field.
121
+
122
+ ---
123
+
124
+ ## 3. Video Generation (Veo 3.1)
125
+
126
+ ### 3.1 Generate a Video
127
+
128
+ ```bash
129
+ node scripts/media-generation/generate-assets.mjs --video <slug>
130
+ ```
131
+
132
+ ### 3.2 Generate All Missing Videos
133
+
134
+ ```bash
135
+ node scripts/media-generation/generate-assets.mjs --all-missing-videos
136
+ ```
137
+
138
+ ### 3.3 Configuration
139
+
140
+ | Setting | Default | Options |
141
+ |---|---|---|
142
+ | Model | `veo-3.1-generate-preview` | — |
143
+ | Duration | 8 seconds | 4, 6, 8 |
144
+ | Resolution | 720p | 720p, 1080p, 4k |
145
+ | Modes | text-to-video | text-to-video, image-to-video |
146
+
147
+ ### 3.4 Polling
148
+
149
+ Video generation is asynchronous. The client:
150
+ 1. Submits the generation request
151
+ 2. Polls every 10 seconds for completion
152
+ 3. Times out after 10 minutes
153
+ 4. Downloads the video file on completion
154
+
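The polling steps above can be sketched as a small loop. This is an illustration under stated assumptions, not the code in `veo-video-client.mjs`: the function name `pollUntilDone`, the `{ done }` status shape, and the injectable `sleep` and `now` options are all hypothetical. Only the 10-second interval and 10-minute timeout come from this guide.

```javascript
// Hypothetical polling loop: calls checkStatus until it reports done or the deadline passes.
async function pollUntilDone(checkStatus, options = {}) {
  const {
    intervalMs = 10_000,   // poll every 10 seconds (documented)
    timeoutMs = 600_000,   // give up after 10 minutes (documented)
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
    now = Date.now,        // injectable clock, useful in tests
  } = options;

  const deadline = now() + timeoutMs;
  while (true) {
    const status = await checkStatus(); // assumed to resolve to { done: boolean, ... }
    if (status.done) return status;
    // Stop before sleeping if the next poll would land past the deadline.
    if (now() + intervalMs > deadline) {
      throw new Error(`Video generation timed out after ${timeoutMs} ms`);
    }
    await sleep(intervalMs);
  }
}

// Usage: a status check that is already complete returns without sleeping.
pollUntilDone(async () => ({ done: true })).then((status) => {
  console.log("finished:", status.done);
});
```

Passing `sleep` and `now` as options keeps the loop testable without waiting on real timers.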
155
+ ---
156
+
157
+ ## 4. Prompt Specifications
158
+
159
+ ### 4.1 Structure
160
+
161
+ Prompts are TypeScript (`.ts`) or ESM (`.mjs`) files in `scripts/media-generation/prompts/`:
162
+
163
+ ```
164
+ prompts/
165
+   illustrations/   # Board pack covers, section illustrations
166
+   diagrams/        # Architecture diagrams, flow charts
167
+   videos/          # Animated intros, presentation backgrounds
168
+ ```
169
+
170
+ ### 4.2 Writing a Prompt Spec
171
+
172
+ ```typescript
173
+ // scripts/media-generation/prompts/illustrations/board-pack-cover-q2-2026.ts
174
+ export default {
175
+   id: "board-pack-cover-q2-2026",
176
+   slug: "board-pack-cover-q2-2026",
177
+   title: "Q2 2026 Board Pack Cover Illustration",
178
+   purpose: "Cover page visual for quarterly board materials",
179
+   imageCategory: "editorial-hand-drawn",
180
+   colorPalette: ["#1C1917", "#FAFAF9", "#78716C", "#D6D3D1"],
181
+   mood: ["institutional", "sophisticated", "premium"],
182
+   elements: [
183
+     "abstract representation of global financial network",
184
+     "interconnected nodes suggesting multi-jurisdiction structure",
185
+     "subtle reference to algorithmic trading"
186
+   ],
187
+   style: "Monochromatic NYT-style hand-drawn editorial illustration. " +
188
+     "Pen-and-ink technique with cross-hatching and stippling. " +
189
+     "No color, no gradients, no digital effects.",
190
+   aspectRatio: "3:4",
191
+ };
192
+ ```
193
+
194
+ ### 4.3 Key Fields
195
+
196
+ | Field | Required | Description |
197
+ |---|---|---|
198
+ | `id` | Yes | Unique identifier |
199
+ | `slug` | Yes | Filename-safe identifier (used for output filename) |
200
+ | `title` | Yes | Human-readable title |
201
+ | `purpose` | Yes | What this asset is used for |
202
+ | `style` | Yes | Detailed style description for the AI model |
203
+ | `colorPalette` | No | Hex colour values (brand-aligned) |
204
+ | `mood` | No | Tone descriptors |
205
+ | `elements` | No | Visual elements to include |
206
+ | `aspectRatio` | No | Output aspect ratio (default: `16:9`) |
207
+ | `imageCategory` | No | Category tag for organisation |
208
+
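A spec can be checked against the table above before any API call is made. The sketch below is hypothetical (the name `validateSpec` is not part of the package) and makes two assumptions: the aspect ratios from section 2.4 apply to the spec being checked, and "filename-safe" means lowercase letters, digits, and hyphens.

```javascript
const REQUIRED_FIELDS = ["id", "slug", "title", "purpose", "style"];
const SUPPORTED_RATIOS = ["1:1", "3:4", "4:3", "9:16", "16:9"];

// Returns a list of problems; an empty list means the spec passed these checks.
function validateSpec(spec) {
  const problems = [];
  for (const field of REQUIRED_FIELDS) {
    if (typeof spec[field] !== "string" || spec[field].trim() === "") {
      problems.push(`missing required field: ${field}`);
    }
  }
  const ratio = spec.aspectRatio ?? "16:9"; // documented default
  if (!SUPPORTED_RATIOS.includes(ratio)) {
    problems.push(`unsupported aspectRatio: ${ratio}`);
  }
  // Assumption: filename-safe = lowercase letters, digits, single hyphens between words.
  const slug = typeof spec.slug === "string" ? spec.slug.trim() : "";
  if (slug !== "" && !/^[a-z0-9]+(-[a-z0-9]+)*$/.test(slug)) {
    problems.push(`slug is not filename-safe: ${slug}`);
  }
  return problems;
}

// Usage: this spec lacks `style`, uses an unsupported ratio, and has an unsafe slug.
console.log(validateSpec({
  id: "x", slug: "Bad Slug", title: "T", purpose: "P", aspectRatio: "2:1",
}));
```

Returning a list rather than throwing on the first problem lets a batch run report every broken spec in one pass.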
209
+ ### 4.4 TypeScript vs ESM
210
+
211
+ - `.ts` files require `tsx` (installed via devDependencies) — the generator shells out to `npx tsx` to evaluate them
212
+ - `.mjs` files are imported directly — slightly faster, no build step
213
+ - Both work identically; use `.ts` for consistency with the rest of the project
214
+
215
+ ---
216
+
217
+ ## 5. Brand Alignment
218
+
219
+ All generated assets must follow `config/brand-assets.yaml`:
220
+
221
+ ### 5.1 Illustration Style
222
+
223
+ - **Monochromatic editorial hand-drawn** (NYT-style)
224
+ - Pen-and-ink technique: cross-hatching, stippling
225
+ - No colour, no gradients, no digital effects
226
+ - Restrained and institutional mood
227
+
228
+ ### 5.2 Colour Palette
229
+
230
+ | Colour | Hex | Usage |
231
+ |---|---|---|
232
+ | Near-black | `#1C1917` | Primary line work |
233
+ | Off-white | `#FAFAF9` | Background/negative space |
234
+ | Warm gray | `#78716C` | Secondary elements |
235
+ | Light gray | `#D6D3D1` | Tertiary/halftone areas |
236
+ | Medium gray | `#A8A29E` | Mid-tones |
237
+
238
+ ### 5.3 What to Avoid
239
+
240
+ - Cartoon or comic styles
241
+ - Isometric SaaS/tech aesthetics
242
+ - Stock photography look
243
+ - Bright colours or saturated palettes
244
+ - Clip art or flat design icons
245
+
246
+ ---
247
+
248
+ ## 6. Output & Logging
249
+
250
+ ### 6.1 Output Directories
251
+
252
+ | Type | Path |
253
+ |---|---|
254
+ | Illustrations | `public/generated/illustrations/<slug>.png` |
255
+ | Diagrams | `public/generated/diagrams/<slug>.png` |
256
+ | Videos | `public/generated/videos/<slug>.mp4` |
257
+
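The path convention in the table lends itself to a small lookup, which is also enough to decide what a batch run still needs to produce. This is a sketch, not the logic in `generate-assets.mjs`: `outputPathFor` and `selectMissing` are hypothetical names, and the type keys `diagram` and `video` are assumed by analogy with the `"illustration"` value shown in the generation log.

```javascript
// Output locations from the table above, keyed by (assumed) asset type.
const OUTPUT_DIRS = {
  illustration: { dir: "public/generated/illustrations", ext: "png" },
  diagram: { dir: "public/generated/diagrams", ext: "png" },
  video: { dir: "public/generated/videos", ext: "mp4" },
};

// Builds the expected output path for one asset.
function outputPathFor(type, slug) {
  if (!Object.hasOwn(OUTPUT_DIRS, type)) {
    throw new Error(`unknown asset type: ${type}`);
  }
  const { dir, ext } = OUTPUT_DIRS[type];
  return `${dir}/${slug}.${ext}`;
}

// Keeps only the slugs whose output file is not in existingPaths.
// existingPaths would come from a directory listing in a real script.
function selectMissing(type, slugs, existingPaths) {
  const existing = new Set(existingPaths);
  return slugs.filter((slug) => !existing.has(outputPathFor(type, slug)));
}

// Usage: "a" already has an output file, so only "b" remains.
console.log(selectMissing("illustration", ["a", "b"], [
  "public/generated/illustrations/a.png",
]));
```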
258
+ ### 6.2 Generation Log
259
+
260
+ Every generation is recorded in `scripts/media-generation/generation-log.json`:
261
+
262
+ ```json
263
+ [
264
+   {
265
+     "slug": "board-pack-cover-q2-2026",
266
+     "type": "illustration",
267
+     "model": "gemini-3-pro-image-preview",
268
+     "generatedAt": "2026-04-09T14:30:00.000Z",
269
+     "outputPath": "public/generated/illustrations/board-pack-cover-q2-2026.png"
270
+   }
271
+ ]
272
+ ```
273
+
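Because the log is a plain JSON array, it can be queried with a few lines of JavaScript. The sketch below uses a hypothetical helper, `latestEntryFor`, and assumes the log has already been parsed (for example with `JSON.parse` on the file contents). The sample entries reuse the field names shown above, with illustrative values only.

```javascript
// Returns the most recent entry for a slug, or undefined if none exists.
// filter() creates a new array, so the sort does not reorder the caller's log.
function latestEntryFor(log, slug) {
  return log
    .filter((entry) => entry.slug === slug)
    .sort((a, b) => Date.parse(b.generatedAt) - Date.parse(a.generatedAt))[0];
}

// Illustrative data only: two runs of the same slug with different models.
const log = [
  { slug: "board-pack-cover-q2-2026", type: "illustration",
    model: "gemini-3-pro-image-preview", generatedAt: "2026-04-09T14:30:00.000Z" },
  { slug: "board-pack-cover-q2-2026", type: "illustration",
    model: "imagen-3.0-generate-002", generatedAt: "2026-04-10T09:00:00.000Z" },
];

console.log(latestEntryFor(log, "board-pack-cover-q2-2026").model);
// prints "imagen-3.0-generate-002"
```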
274
+ ### 6.3 Rate Limiting
275
+
276
+ A 15-second delay is enforced between consecutive API requests to avoid rate limits. For batch generation (`--all-missing`), this caps throughput at about 4 images per minute; actual throughput is lower once generation time is included.
277
+
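The delay gives a quick lower bound on how long a batch will take. The helper below is a hypothetical sketch (`minBatchSeconds` is not part of the package) and counts only the enforced gaps between requests, so the real run will be longer by however much time generation itself takes.

```javascript
// Lower bound on batch duration in seconds: N requests have N - 1 gaps between them.
function minBatchSeconds(assetCount, delaySeconds = 15) {
  if (!Number.isInteger(assetCount) || assetCount < 0) {
    throw new RangeError("assetCount must be a non-negative integer");
  }
  return Math.max(0, assetCount - 1) * delaySeconds;
}

// Usage: 20 missing illustrations spend at least 19 * 15 seconds waiting.
console.log(minBatchSeconds(20)); // prints 285
```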
278
+ ---
279
+
280
+ ## 7. Testing
281
+
282
+ | # | Test | How to Verify |
283
+ |---|---|---|
284
+ | 1 | API key loaded | `node -e "require('dotenv').config(); console.log(process.env.GEMINI_API_KEY?.slice(0,8))"` (prints the first 8 characters; `undefined` means the key was not loaded) |
285
+ | 2 | List specs | `node scripts/media-generation/generate-assets.mjs --list` |
286
+ | 3 | Generate image | Generate one illustration and verify output in `public/generated/` |
287
+ | 4 | Generate video | Generate one video (takes several minutes) |
288
+ | 5 | Brand alignment | Verify output matches monochromatic editorial style |
289
+ | 6 | Batch generation | `--all-missing` — should skip already-generated assets |
290
+ | 7 | Generation log | Check `generation-log.json` for entries |
291
+
292
+ ---
293
+
294
+ ## 8. Troubleshooting
295
+
296
+ ### "GEMINI_API_KEY not set"
297
+
298
+ 1. Check `.env` file exists and contains the key
299
+ 2. Verify key format (should be a long alphanumeric string)
300
+ 3. Get a key from https://aistudio.google.com/apikey
301
+
302
+ ### Rate limit errors
303
+
304
+ 1. The 15-second delay should prevent most rate limits
305
+ 2. If hit, wait 60 seconds and retry
306
+ 3. For heavy batch generation, consider running overnight
307
+ 4. Check your API quota at https://aistudio.google.com/
308
+
309
+ ### "Prompt spec not found"
310
+
311
+ 1. Check the slug matches a file in `scripts/media-generation/prompts/illustrations/` (or `diagrams/`, `videos/`)
312
+ 2. File must end in `.ts` or `.mjs`
313
+ 3. List available specs: `node scripts/media-generation/generate-assets.mjs --list`
314
+
315
+ ### Image quality issues
316
+
317
+ 1. Review the `style` field in the prompt spec — be more specific about the desired aesthetic
318
+ 2. Add negative prompts ("no cartoon", "no digital effects")
319
+ 3. Try a different model (Imagen may produce different results than Gemini)
320
+ 4. Adjust aspect ratio to match the intended use
321
+
322
+ ### Video generation timeout
323
+
324
+ 1. Default timeout is 10 minutes — video generation can be slow
325
+ 2. Check the Veo API status if consistently timing out
326
+ 3. Try shorter duration (4s instead of 8s) for faster generation
327
+ 4. Lower resolution (720p) generates faster than 1080p/4k
328
+
329
+ ---
330
+
331
+ ## Key Files
332
+
333
+ | File | Purpose |
334
+ |---|---|
335
+ | `scripts/media-generation/generate-assets.mjs` | Main generation orchestrator |
336
+ | `scripts/media-generation/gemini-image-client.mjs` | Google Gemini/Imagen image API client |
337
+ | `scripts/media-generation/veo-video-client.mjs` | Veo 3.1 video API client |
338
+ | `scripts/media-generation/prompts/` | Prompt spec directory (illustrations, diagrams, videos) |
339
+ | `scripts/media-generation/generation-log.json` | Generation history log |
340
+ | `public/generated/` | Output directory for all generated assets |
341
+ | `config/brand-assets.yaml` | Brand style guide (colours, typography, illustration style) |
342
+
343
+ ---
344
+
345
+ ## Related Documents
346
+
347
+ - [PDF Generation Setup](pdf-generation-setup.md) — Using generated assets in branded PDFs
348
+ - [Agent Persona Setup](agent-persona-setup.md) — Brand configuration
349
+ - [Mac Mini Bootstrap](../runbooks/mac-mini-bootstrap.md) — Node.js and dependency installation