npm - @kolbo/kolbo-code-linux-arm64-musl - Versions diffs - 2.2.4 → 2.3.0 - Mend

@kolbo/kolbo-code-linux-arm64-musl 2.2.4 → 2.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (26)

package/bin/kolbo +0 -0
package/package.json +1 -1
package/skills/kolbo/SKILL.md +177 -1651
package/skills/kolbo/VERSION +1 -0
package/skills/kolbo/references/models/creative-director.md +106 -0
package/skills/kolbo/references/models/gpt-image.md +111 -0
package/skills/kolbo/references/models/html-presentation.md +139 -0
package/skills/kolbo/references/models/landing-page.md +135 -0
package/skills/kolbo/references/models/music.md +120 -0
package/skills/kolbo/references/models/nano-banana.md +97 -0
package/skills/kolbo/references/models/prompt-copilot.md +133 -0
package/skills/kolbo/references/models/seedance.md +90 -0
package/skills/kolbo/references/models/veo.md +110 -0
package/skills/kolbo/references/models/visual-code.md +80 -0
package/skills/kolbo/references/workflows/app-builder.md +41 -0
package/skills/kolbo/references/workflows/cost-and-validation.md +138 -0
package/skills/kolbo/references/workflows/dtc-ads.md +126 -0
package/skills/kolbo/references/workflows/marketing-studio.md +157 -0
package/skills/kolbo/references/workflows/marketplace-cards.md +146 -0
package/skills/kolbo/references/workflows/media-library.md +76 -0
package/skills/kolbo/references/workflows/product-photoshoot.md +199 -0
package/skills/kolbo/references/workflows/production-log.md +155 -0
package/skills/kolbo/references/workflows/research-first.md +174 -0
package/skills/kolbo/references/workflows/transcription.md +163 -0
package/skills/kolbo/references/workflows/troubleshooting.md +73 -0
package/skills/kolbo/references/workflows/visual-dna.md +233 -0

package/skills/kolbo/references/workflows/cost-and-validation.md ADDED

@@ -0,0 +1,138 @@
+# Cost Awareness, Validation & Constraints
+Load this file when you need to: confirm cost before firing a generation, validate input params against a model's caps, or quote real cost after a generation completes.
+## Billing Units by Type
+Creative generations bill against the user's Kolbo credit balance. **Billing units differ by type** — apply the correct formula before generating.
+| Type | Billing unit | Credit range | Example |
+|------|-------------|-------------|---------|
+| **Image** | per image (flat) | 1–30 cr | Flux.1 Fast = 1 cr, Midjourney = 4 cr. If `resolution` is set, check `resolutionMultipliers` — some families multiply cost significantly at higher tiers. |
+| **Image edit** | per image (flat) | 2–20 cr | |
+| **Video** | **cr/s × duration** | 2–30 cr/s | Kandinsky 5 Fast × 5s = 10 cr; Seedance 2.0 × 10s = 300 cr. Check `resolutionMultipliers` + `soundCreditMultiplier`. |
+| **Video from image** | **cr/s × duration** | 4–30 cr/s | Same per-second rule. |
+| **Elements (ref-to-video)** | **cr/s × duration** | 4–30 cr/s | Check `credit` and multipliers in `list_models type="elements"`. |
+| **Lipsync** | **cr/s × duration** | 5–20 cr/s | |
+| **Music** | per generation (flat) | 15–60 cr | Suno v5 = 15 cr; ElevenLabs Music = 60 cr |
+| **Speech (TTS)** | per 100 characters | 2–5 cr/100 chars | ElevenLabs (5) × 500 chars = 25 cr |
+| **Sound effects** | per generation (flat) | 4–7 cr | |
+| **3D model** | per model (flat) | 5–300 cr | Trellis = 5 cr; Meshy v6 = 150 cr; Marble 1.1 = 300 cr |
+| **Transcription (STT)** | per minute of audio | `model.credit × duration_minutes` | |
+## Calculation Formulas
+Apply when confirming cost before firing:
+- **Video / Lipsync**: `total = model_credit_per_second × duration_seconds`. Never assume the credit shown is a flat per-generation cost for these types.
+- **Music**: flat per generation — `total = model_credit` (duration does not change cost).
+- **TTS**: `total = model_credit × ceil(character_count / 100)`. Count actual characters first. 1000 chars with ElevenLabs = 50 credits.
+- **Images / 3D / Sound effects**: `total = model_credit × quantity`.
+- **Resolution / audio multipliers**: if `resolution` is set or the model has native audio, read `resolutionMultipliers[tier]` and `soundCreditMultiplier`. Formula for per-second types: `final = base × resolutionMult × (sound ? soundMult : 1) × durationSeconds`; for flat-billed types, drop `durationSeconds`.
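The formulas above can be sketched as a single pre-flight estimator. This is a minimal illustration, not the Kolbo implementation: the function name `estimate_credits`, the `billing` labels, and the parameter names are all hypothetical, and transcription (billed per minute) is left out because the source does not say how partial minutes are rounded.

```python
import math


def estimate_credits(billing, credit, *, duration_s=0, characters=0,
                     quantity=1, resolution_mult=1.0, sound_mult=1.0):
    """Preview-only credit estimate; the real cost is `credits_used`.

    `billing` is a hypothetical label for the billing unit:
      "per_second"    -> video, video from image, elements, lipsync
      "per_100_chars" -> speech (TTS)
      "flat"          -> image, music, sound effects, 3D
    """
    if billing == "per_second":
        base = credit * duration_s
    elif billing == "per_100_chars":
        # Partial blocks of 100 characters are rounded up.
        base = credit * math.ceil(characters / 100)
    elif billing == "flat":
        base = credit * quantity
    else:
        raise ValueError(f"unknown billing unit: {billing!r}")
    # Multipliers default to 1.0 when no resolution tier or sound is set.
    return base * resolution_mult * sound_mult


# Examples taken from the billing table:
video = estimate_credits("per_second", 30, duration_s=10)    # 30 cr/s for 10s
speech = estimate_credits("per_100_chars", 5, characters=500)  # 5 cr per 100 chars
```

Keeping the multipliers as plain keyword arguments means the caller decides where they come from (for example, the model entry returned by `list_models`), which avoids guessing at the exact field layout.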
+### Tier label → pixel mapping (rough)
+- Images: `"1K"` ≈ 1024px, `"2K"` ≈ Full HD (1920×1080), `"3K"` ≈ QHD (2560×1440), `"4K"` ≈ UHD (3840×2160). Picker shows only tiers the model supports (per `supported_resolutions`).
+- Videos: `"720p"` / `"1080p"` / `"1440p"` / `"2160p"` = vertical pixels. Some models use model-specific labels like `"512P"` / `"1024P"` (Hailuo).
+## When to Confirm Cost
+**Skip cost confirmation when:**
+- The user already specified model + count + duration ("make 5 videos, seedance 2 fast, 15s" IS the confirmation).
+- A single generation costs under 5 credits.
+**Require cost confirmation in every other case:**
+- Present a one-line summary: "8 videos × 5s × [model] @ X cr/s = **Y credits**. Proceed?"
+- Suggest a cheaper alternative if one exists.
+- Wait for the user's confirmation before firing.
+**Batch totalling 100+ credits:** run `check_credits` first and include the available balance in the summary.
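The confirmation rules above can be expressed as a small decision helper. This is a sketch under stated assumptions: `confirmation_plan` and its parameters are hypothetical names, and the "under 5 credits" rule is read here as applying to a request for exactly one generation.

```python
def confirmation_plan(count, per_generation_credits, user_specified_all=False):
    """Decide whether to ask before firing and whether to check the balance.

    user_specified_all: True when the user already named model, count and
    duration, which the text treats as the confirmation itself.
    """
    total = count * per_generation_credits
    cheap_single = count == 1 and per_generation_credits < 5
    return {
        "total": total,
        # Ask unless the request is already explicit or trivially cheap.
        "confirm": not (user_specified_all or cheap_single),
        # Batches of 100+ credits call for `check_credits` first.
        "check_balance": total >= 100,
    }


plan = confirmation_plan(count=8, per_generation_credits=50)
```

The two flags are returned independently, so a fully specified but expensive batch still triggers the balance check even though the question to the user is skipped.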
+## ⚠️ Quote Real Cost, Never Estimates (CRITICAL)
+Pre-flight formulas above are for **preview only**. After firing, every generation returns `credits_used` (multiplier-adjusted total) and `credits_breakdown` (per-model attribution).
+```json
+{
+  "credits_used": 12,
+  "credits_breakdown": [
+    { "model": "nano-banana-2", "base": 8, "final": 12, ... }
+  ],
+  "urls": [...]
+}
+```
+**Log `credits_used` to `.kolbo/production.md`**, not `base × count`. The multiplier-adjusted number is the only truth.
+When the user asks "how much did I spend?" → call `get_session_usage` for the real, multiplier-adjusted session total + per-tool + per-model breakdowns (same numbers as the desktop bottom-bar counter).
+## Validation Pattern — Every Generation
+Before submitting:
+1. Call `list_models type=<tool-type>` (text mode is enough for picking; `format: "json"` for programmatic comparison).
+2. For each input array (refs / DNAs / elements) — check `length <= <cap>` from the canonical field reference below. If over, drop the lowest-priority entries OR ask the user.
+3. For each enumerated value (`aspect_ratio` / `resolution` / `duration`) — check it's in `supported_*`. If not, **do not silently substitute**; show the user the allowed set and ask.
+4. For each duration-bearing file (source_video for lipsync/v2v, audio for lipsync/elements) — pre-check duration against the min/max range. Use ffmpeg if needed.
+5. For uploads — pre-check size against `max_file_size`.
+The MCP tool descriptions also embed the cap field name on the relevant parameter (e.g. `reference_images: "...Cap: pass at most max_reference_images..."`) — use those as inline reminders.
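As a rough illustration of steps 2, 3 and 5, the check below compares a request against one model entry. It assumes the entry is available as a plain dict keyed by the field names used in this file; the function name `validate_request`, the request key `file_size_bytes`, and the dict shape are hypothetical. It only reports problems and never substitutes a value.

```python
def validate_request(model: dict, request: dict) -> list[str]:
    """Return human-readable problems; an empty list means the request fits."""
    problems = []

    # Step 2: array caps (here only reference_images / max_reference_images).
    refs = request.get("reference_images", [])
    cap = model.get("max_reference_images")
    if cap is not None and len(refs) > cap:
        problems.append(f"reference_images: {len(refs)} given, cap is {cap}")

    # Step 3: enumerated values must be in the supported set.
    # An empty or missing set is treated as "no constraint published".
    for param, field in (("aspect_ratio", "supported_aspect_ratios"),
                         ("resolution", "supported_resolutions"),
                         ("duration", "supported_durations")):
        value = request.get(param)
        allowed = model.get(field) or []
        if value is not None and allowed and value not in allowed:
            problems.append(f"{param}: {value!r} not in {allowed}")

    # Step 5: upload size in bytes; a null limit means platform default.
    size = request.get("file_size_bytes")
    limit = model.get("max_file_size")
    if size is not None and limit is not None and size > limit:
        problems.append(f"upload: {size} bytes exceeds max_file_size {limit}")

    return problems
```

Returning the allowed set inside each message makes it easy to show the user their options and ask, rather than picking a replacement for them.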
+## Canonical Field Reference — Which `list_models` Field Controls Which Input
+The same conceptual slot (e.g. "max reference images") lives under **different field names per model family**. Read the row for your tool, not the model name.
+| Your input | Tool(s) | Field on the model | What `0` / `null` means |
+|---|---|---|---|
+| `reference_images` | `generate_image`, `generate_image_edit` (uses `source_images`), `generate_creative_director`, `generate_video` | `max_reference_images` | `0` = no refs |
+| `reference_images` | `generate_elements` | `elements_max_images` | `0` = no image refs |
+| `reference_images` | `generate_video_from_video` | `max_images` | `0` = no secondary image input |
+| `reference_videos` | `generate_elements` | `elements_max_videos` | `0` = no video refs |
+| `reference_videos` | `generate_video_from_video` | `max_videos` | `<= 1` = only the source_video |
+| `elements` | `generate_video_from_video` | `max_elements` | `0` = no elements |
+| `audio_url` | `generate_elements` | `elements_max_audio` (+ `max_audio_duration` for the file) | `0` = no audio ref |
+| `visual_dna_ids` | every DNA-aware tool | `max_visual_dna` (+ `supports_visual_dna` boolean) | `null` / `0` / `false` = model rejects DNA |
+| `aspect_ratio` | any | `supported_aspect_ratios` (or `_by_type[<type>]` when multimodal) | empty → `default_aspect_ratio` if set |
+| `resolution` | any | `supported_resolutions` (+ `resolution_multipliers` for cost) | empty → no resolution tiering |
+| `duration` (video output) | video tools | `supported_durations`, else `min_output_duration`–`max_output_duration` | both null → omit and let server default |
+| **input** video duration | `lipsync-video`, `generate_video_from_video` | `min_video_duration` – `max_video_duration` | outside range → reject |
+| input audio duration | `generate_lipsync`, `generate_elements` audio | `min_audio_duration` – `max_audio_duration` (+ `audio_max_follows_video_duration` for lipsync) | outside range → reject |
+| audio file format | any audio input | `supported_audio_formats` (e.g. `["mp3","wav","m4a"]`; empty = all) | pre-validate before upload |
+| recording duration | `text_to_speech` recording UX | `min_recording_duration` – `max_recording_duration` | usually null for plain TTS |
+| upload file size | every file upload | `max_file_size` (bytes) | null → use platform default |
+| `num_images` | image tools | `images_per_request` overrides for fixed-output models (Midjourney returns 4) | null → `num_images` honored as-is |
+| `prompt` | every tool | `requires_prompt`, `min_prompt_length`, `max_prompt_length` | null → unconstrained |
+| sound on/off | video tools | `sound_generation_type` (`"native"` vs `"none"`), `sound_enabled_by_default`, `sound_credit_multiplier` | not `"native"` → can't emit synced audio |
+| capability gate | route decision | `supports_visual_dna`, `supports_first_last_frame`, `supports_audio_input` | `false` → the controller silently drops that param |
+Cost formula: `final_cost = credit × resolution_multipliers[resolution] × (sound_enabled ? sound_credit_multiplier : 1)`, multiplied by duration in seconds for per-second types and by `num_images` / `scene_count` as applicable.
+## Decision Rule for Resolution
+1. **User specified resolution explicitly** ("4K", "1080p", "480p") → ALWAYS verify in `supported_resolutions` BEFORE firing. If not supported:
+   - ❌ Do **NOT** silently substitute. The user asked for 480p; sending 720p without consent burns 1.5–2× the credits they expected.
+   - ✅ Show them the supported set in one line and ask:
+     > "Seedance 2 elements supports `[720p, 1080p, 1440p, 2160p]`; 480p isn't available. The closest cheaper option is 720p (~+X credits over your intent). Want 720p, or pick another?"
+   - Only fire after they reply.
+2. **User specified quality intent without numbers** ("draft", "quick test", "final delivery", "for client", "production"):
+   - draft / quick / preview → cheapest in `supported_resolutions` (1K / 720p)
+   - normal / standard → middle tier (typically 2K / 1080p)
+   - final / production / hero → highest the user's budget allows (3K-4K / 1440p-2160p)
+3. **No quality signal AND cost difference >2×** OR total batch ≥4 outputs → **ask the user once** with a one-line cost comparison, then default to standard if they don't reply.
+4. **No quality signal AND cost difference ≤2×** → quietly use the cheapest supported, no need to interrupt.
+5. **Sound on a video model with `sound_credit_multiplier > 1`** → if user didn't ask for sound, leave it off. If user said "with sound" / "with music", enable it.
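Rules 1 and 2 can be sketched as follows. The helper name `resolve_resolution` is hypothetical, the `supported` list is assumed to be ordered from cheapest to most expensive, and the "budget allows" part of the final tier is not modelled. Rules 3 to 5 depend on live cost data and are left out.

```python
DRAFT = {"draft", "quick", "preview"}
STANDARD = {"normal", "standard"}
FINAL = {"final", "production", "hero"}


def resolve_resolution(supported, explicit=None, intent=None):
    """Return (resolution, must_ask_user).

    supported: tiers from `supported_resolutions`, cheapest first (assumed).
    """
    if explicit is not None:
        # Rule 1: never substitute an unsupported explicit request.
        if explicit in supported:
            return explicit, False
        return None, True
    if not supported:
        return None, False  # model has no resolution tiering
    if intent in DRAFT:
        return supported[0], False
    if intent in STANDARD:
        return supported[(len(supported) - 1) // 2], False
    if intent in FINAL:
        return supported[-1], False
    raise ValueError(f"no rule for intent {intent!r}")


tiers = ["720p", "1080p", "1440p", "2160p"]
choice, must_ask = resolve_resolution(tiers, explicit="480p")
```

When `must_ask` is true, the caller shows the supported set and waits for a reply instead of firing.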
+## Defaults When Nothing Is Specified
+- **Image**: `1K` (or the cheapest in `supported_resolutions`).
+- **Video**: `720p` (or the cheapest), with `default_duration` (or shortest in `supported_durations`).
+- **Sound**: respect `sound_enabled_by_default`; if false, leave off.
+## Always Log the Resolution / Duration / Sound Choices
+Production-log entries should include the resolution and (for video) duration + sound state alongside the URL, so the user can see what they paid for:
+```md
+- still: https://...01-coffee.png  (flux-2-pro · 1K, 2026-05-14)
+- video: https://...02-rain.mp4   (kling-2 · 1080p · 5s · sound-off, 2026-05-14)
+```

package/skills/kolbo/references/workflows/dtc-ads.md ADDED

@@ -0,0 +1,126 @@
+# DTC Ads — Composed Brand Image Workflow
+Load this file when the user wants a **DTC ad image** composed from brand identity + ad format + optional avatar/product/reference media. For ad **video** see `workflows/marketing-studio.md`. For brand **product imagery** (Pinterest pin, hero banner, ad pack) see `workflows/product-photoshoot.md`. For marketplace listings see `workflows/marketplace-cards.md`.
+## What This Is
+A DTC ad is built from **5 composable blocks**:
+1. **Ad format** — the structural template (headline-driven, bullet-points, us-vs-them, before-after, founder-statement, etc.). Defines the layout shape.
+2. **Brand kit** *(optional)* — palette, fonts, logo, tone, voice. Keeps every ad in a campaign visually consistent.
+3. **Avatar** *(optional)* — a presenter face (curated character or trained Visual DNA). Use when the brand has a specific founder, model, or recurring presenter.
+4. **Product** *(optional)* — the item being sold. One product image, or a product brief from a URL.
+5. **Reference media** *(optional)* — up to ~14 reference images to anchor style / composition / setting.
+You don't need all 5. The minimum is: a **prompt** + an **ad format**. Everything else is opt-in based on what the user provides.
+## End-to-End Flow
+```
+1. Pick an ad format     → ask user (labeled options, never auto-pick)
+2. Pick / build brand kit → workflows/research-first.md persists to .kolbo/brand-kits/<slug>.md
+3. Attach avatar          → workflows/visual-dna.md ("character" type DNA)
+4. Attach product         → upload_media → reference_images
+5. Attach reference media → upload_media → reference_images (up to ~14 total)
+6. Generate               → generate_creative_director (multi-variant) or generate_image (single)
+7. Deliver                → image URLs + brief one-line summary
+```
+## Ad Format — Always Ask Explicitly
+Picking an ad format is **mandatory and creative** — don't auto-pick from the user's phrasing. The catalogue is small and the choice changes the layout shape dramatically. Always present labeled options:
+| Format type | Examples |
+|---|---|
+| **Headline-driven** | Big hero phrase + small product. "Hero word" style. |
+| **Bullet points** | 3–5 benefit bullets + product hero. SaaS / DTC standard. |
+| **Us vs Them** | Side-by-side comparison column. Competitor takedown style. |
+| **Before / After** | Split frame showing transformation. Great for skincare, fitness, home. |
+| **Founder statement** | Founder portrait + quote + product. Trust-builder. |
+| **Lifestyle hero** | Product in use in an aspirational scene. No hero copy. |
+| **Pure product** | Clean studio product shot with brand framing. |
+| **Testimonial** | Customer quote + face + product. Social proof. |
+| **Pattern interrupt** | Bold color block / typographic shock / surreal composition. Scroll-stopper. |
+When the user says "make me an ad" without naming a format, offer 3 of these in a labeled question (don't dump all 9). Pick the 3 that best fit the product / brand / phase the user mentioned.
+## Brand Kit Reuse
+If `.kolbo/brand-kits/<slug>.md` exists for the brand (see `workflows/research-first.md`), **Read it first** and pull `primary_color`, `accent_color`, `text_color`, `bg_color`, `fonts`, `tone`, `target_user`, `logo_url`. Bake these into the prompt:
+- Exact hex codes for every color (`#FF4D2E` not "orange")
+- Named fonts (`Inter Bold for headline, Inter Regular for body`)
+- Tone descriptors from `### Voice & Audience`
+- Logo as `reference_images[0]` with `@image1` reference in the prompt ("place logo from `@image1` top-left at 8% width, no recolor")
+If no brand kit exists and the user gives a brand URL, run `workflows/research-first.md` to build one. Then come back here.
+## Avatar Workflow
+For ads featuring a specific presenter (founder, recurring model, character):
+1. **Check if a Visual DNA exists** — `list_visual_dnas`. Match by name or recent use.
+2. **If yes** — pass `visual_dna_ids: ["<id>"]` and reference as `@<dna-name>` in the prompt.
+3. **If no** and the user wants a specific person — create one per `workflows/visual-dna.md` (always generate 2 reference images first; lock single-token lowercase name).
+4. **If no** and the brief doesn't need a specific face — skip the avatar entirely; the model will synthesize a plausible presenter.
+## Product Workflow
+For ads featuring a specific product:
+| User provides | Do |
+|---|---|
+| **Product photo** (local file or URL) | `upload_media({ source })` → tag as `@image1` in prompt → log to `.kolbo/production.md` under `### Products` |
+| **Product URL only** (no photo) | Run `workflows/research-first.md` first to scrape hero images + brand palette; re-host via `upload_media` → use Kolbo CDN URL |
+| **Multiple angles** | Upload all in parallel (one `upload_media` call each) → pass all in `reference_images` → tag `@image1`, `@image2`, … per `workflows/visual-dna.md` reference-tagging rules |
+| **Nothing — text only** | Ask once: "Do you have a product photo? It dramatically improves fidelity." If they say no, proceed text-only but warn quality may be lower |
+**Always log products in `.kolbo/production.md`** so subsequent ads in the same workspace reuse the same CDN URL without re-uploading.
+## Reference Media Cap
+Up to **~14 reference images per call**. Beyond that, the model gets confused about which reference plays which role. Use **`@image1` / `@image2` / …** tags to bind each reference to a role:
+```
+Headline ad with @maya (the founder) holding @image1 (the product),
+shot in the style of @image2 (lifestyle reference).
+Match the palette from the brand kit (#FF4D2E primary, #1A1A1A text).
+```
+See `workflows/visual-dna.md` for the full tagging system.
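A minimal sketch of the tagging idea, assuming tags are assigned by 1-based position in `reference_images` (as in the logo example, where `reference_images[0]` is `@image1`). The helper `tag_references` and the hard limit of 14 are illustrative assumptions; the source gives the cap only as approximate, and the real cap comes from the model entry.

```python
MAX_REFERENCE_IMAGES = 14  # assumption: the source says "~14"


def tag_references(roles):
    """Map each role name to its positional @imageN tag.

    roles: unique role labels in the same order as `reference_images`.
    """
    if len(roles) > MAX_REFERENCE_IMAGES:
        raise ValueError(f"{len(roles)} references exceeds {MAX_REFERENCE_IMAGES}")
    return {role: f"@image{i}" for i, role in enumerate(roles, start=1)}


tags = tag_references(["product", "lifestyle_style"])
prompt = (f"Headline ad with the founder holding {tags['product']}, "
          f"shot in the style of {tags['lifestyle_style']}.")
```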
+## Generate
+**Pick the right Kolbo MCP tool based on output count:**
+- **Single ad image** → `generate_image` with `model: "<from list_models>"`. Use Nano Banana 2 for character/lifestyle, GPT Image 2 for layouts with dense on-image text or infographics, Nano Banana Pro for hero/brand-final assets.
+- **Multi-variant set** (3–8 variants of the same ad concept with different palettes / angles / hooks) → `generate_creative_director` with `scene_count`. The director plans each variant's prompt internally.
+- **Identical prompt, just different seeds** (rare for ads — usually you want varied direction) → `generate_image` with `num_images: 1–4`.
+## Output Settings — Always Confirm
+These materially change output and cost. Ask once, labeled options, before firing:
+| Setting | Common options for ads |
+|---|---|
+| `aspect_ratio` | `1:1` (IG feed) / `9:16` (Reels / TikTok / Stories) / `4:5` (IG portrait) / `16:9` (YouTube, banners) / `1.91:1` (Facebook feed) |
+| `resolution` | `1K` (drafts, fast iteration) / `2K` (standard delivery) / `4K` (hero / print) |
+| Quantity | `1` (test) / `3–4` (variant exploration) / `8` (full ad pack via Creative Director) |
+Default-to-cheapest when the user hasn't expressed a quality intent and the difference is ≤ 2× cost.
+## Failure Handling
+- **Content-policy refusal** → don't retry the same prompt. Suggest less-explicit phrasing or a different product framing.
+- **Brand asset not loading** (logo URL 404, hex code typo) → fix the brand kit file, then retry.
+- **Watermarks / extra text appearing uninvited** → add explicit prompt constraints: "NO captions, NO subtitles, NO watermarks, NO extra text beyond what's specified." This is the most common DTC ad failure mode — models love to invent copy.
+- **Generic 5xx / rate-limit** → retry ONCE with the same payload after a short pause. See SKILL.md "Detecting failed generations".
+## UX Rules
+1. **Always pick an ad format explicitly** with the user — never auto-pick.
+2. **Always confirm aspect ratio + resolution + quantity** before firing.
+3. **Always check for a brand kit** before scraping fresh — `Read .kolbo/brand-kits/<slug>.md` first.
+4. **Always log products + brand kits in `.kolbo/production.md`** so future ads reuse instead of re-uploading / re-scraping.
+5. **No auto-retry on failure**, apart from the single retry allowed for generic 5xx / rate-limit errors: surface the reason and let the user adjust.
+6. **Strict NO uninvited additions** in every ad prompt: "NO captions, NO subtitles, NO watermarks, NO extra text beyond what's specified."

package/skills/kolbo/references/workflows/marketing-studio.md ADDED

@@ -0,0 +1,157 @@
+# Marketing Studio — UGC, Ads & Branded Video
+Load this file when the user wants **branded ad video** — UGC, unboxing, product showcase, TV spot, virtual try-on, or any "make me an ad / commercial / creator video" request.
+For ad **images** (Pinterest pin, hero banner, ad creative pack) see `workflows/product-photoshoot.md`.
+For **marketplace listings** (Amazon main + secondary + A+ content) see `workflows/marketplace-cards.md`.
+For the **DTC ads engine flow** (brand kit + ad format + avatar + product) see `workflows/dtc-ads.md`.
+## The 9 Marketing Modes
+| Mode | What it's for | Hook/Setting allowed? |
+|---|---|:-:|
+| `ugc` | **Default.** Casual, organic-feel content from a presenter | ✅ |
+| `ugc_how_to` | Tutorial / explainer — "here's how to use this" | ✅ |
+| `ugc_unboxing` | Unboxing reveal — "just got this in the mail" | ✅ |
+| `product_showcase` | Clean product highlight, polished | ❌ |
+| `product_review` | Presenter giving an opinion on the product | ✅ |
+| `tv_spot` | Broadcast-style commercial, higher production | ❌ |
+| `wild_card` | Experimental — model picks the vibe | ❌ |
+| `ugc_virtual_try_on` | Person trying on clothing / accessories — UGC vibe | ✅ |
+| `virtual_try_on` | Same but polished, model-driven | ❌ |
+**"Hook/Setting allowed"** = whether reusable opening hook prompts and scene-setting prompts can be prepended to the user prompt. Polished modes (`product_showcase`, `tv_spot`, `wild_card`, `virtual_try_on`) ignore hooks/settings.
+**Default when the user doesn't specify a mode:** `ugc`.
+## Picking the Mode
+| User phrasing | Mode |
+|---|---|
+| "UGC", "creator video", "talking head", "phone-shot", "selfie video", "vlogger" | `ugc` |
+| "tutorial", "how to use", "demonstrate", "walkthrough", "explainer" | `ugc_how_to` |
+| "unboxing", "just got this", "reveal", "first impression" | `ugc_unboxing` |
+| "product showcase", "highlight reel", "showroom" | `product_showcase` |
+| "review", "my take on", "comparing X to Y", "honest opinion" | `product_review` |
+| "TV ad", "commercial", "broadcast", "polished ad spot" | `tv_spot` |
+| "surprise me", "something different", "experimental" | `wild_card` |
+| "try on" / "wearing the X" + organic vibe | `ugc_virtual_try_on` |
+| "fashion shoot", "lookbook", polished try-on | `virtual_try_on` |
+If the user mentions a product / brand but no mode word, default to `ugc`. If they say "ad" without "TV ad" / "commercial" / "broadcast", default to `ugc` (most modern ads are UGC-shaped).
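The phrasing table can be approximated with a keyword lookup. This is a rough sketch, not how Kolbo routes requests: `suggest_mode` is a hypothetical helper, matching is naive lowercase substring search, more specific modes are checked first, and the organic versus polished try-on distinction is reduced to a few keywords. The result is a suggestion to confirm with the user, not a final choice.

```python
# Checked in order; the first mode with a matching phrase wins.
MODE_KEYWORDS = [
    ("ugc_unboxing", ["unboxing", "just got this", "reveal", "first impression"]),
    ("ugc_how_to", ["tutorial", "how to use", "demonstrate", "walkthrough", "explainer"]),
    ("product_review", ["review", "my take on", "comparing", "honest opinion"]),
    ("tv_spot", ["tv ad", "commercial", "broadcast", "polished ad spot"]),
    ("product_showcase", ["product showcase", "highlight reel", "showroom"]),
    ("wild_card", ["surprise me", "something different", "experimental"]),
    ("virtual_try_on", ["fashion shoot", "lookbook"]),
    ("ugc_virtual_try_on", ["try on", "wearing the"]),
]


def suggest_mode(request_text):
    """Return a marketing mode suggestion; falls back to the `ugc` default."""
    text = request_text.lower()
    for mode, phrases in MODE_KEYWORDS:
        if any(phrase in text for phrase in phrases):
            return mode
    return "ugc"


mode = suggest_mode("Make a TV ad for our running shoes")
```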
+## Mode → Kolbo MCP Routing
+The mode determines which Kolbo MCP tool to call, what defaults to set, and what's forbidden.
+| Mode | Primary tool | aspect_ratio | duration | sound_enabled | Captions / watermarks |
+|---|---|---|---|:-:|:-:|
+| `ugc`, `ugc_how_to`, `ugc_unboxing`, `ugc_virtual_try_on`, `product_review` | `generate_video_from_image` (frame-first) OR `generate_elements` (Visual DNA → video) | **`9:16`** | model's `default_duration` (5–8s) | OFF | **Never add** |
+| `product_showcase` | `generate_creative_director` with `workflow_type: "video"` (for multi-shot) OR `generate_video` (single) | `16:9` or `1:1` | 5–10s | ON if model supports `sound_generation_type: "native"` | Allowed if user asks |
+| `tv_spot` | `generate_creative_director` with `workflow_type: "video"` (3–6 shots for a beat structure) | `16:9` | 15–30s total | ON (full audio + dialogue) | Allowed if part of the spot |
+| `virtual_try_on` | `generate_elements` with character Visual DNA + product as `reference_images` | `9:16` or `4:5` | 5–8s | OFF | Never add |
+| `wild_card` | User's chosen model with broader prompt latitude (no mode-specific defaults) | User's pick | User's pick | User's pick | User's pick |
+**Pick the actual model** with `list_models({ type: "..." })` and validate caps before firing — see SKILL.md "Resolution / Aspect / Duration — validate against caps".
+## UGC Family Defaults (CRITICAL)
+When ANY `ugc*` mode is selected, snap to these unless the user explicitly overrides:
+| Setting | UGC default | Why |
+|---|---|---|
+| `aspect_ratio` | `9:16` | TikTok / Reels / Shorts are vertical-first |
+| Visual aesthetic | Phone-shot, handheld, natural lighting | UGC works because it doesn't look produced |
+| Camera language | Slight handheld sway, selfie-arm framing, key light from window/screen | NOT slow dollies, NOT crane moves, NOT studio key |
+| Energy | "Talking to a friend" — casual, direct-to-camera, occasional gestures | Not theatrical, not staged |
+| **Captions / subtitles / text overlays** | **NEVER add** unless explicitly requested | Users add captions in CapCut / native editor; baked-in captions limit reuse |
+| **Brand watermarks / lower-thirds / banners** | **NEVER add** unless explicitly requested | Same reason |
+| Music / SFX | OFF by default unless asked | They'll layer their own audio in post |
+| Length | Model's `default_duration` (typically 5–8s) | Shorter = more usable for the algorithm |
+**Phrases that activate UGC defaults:** "UGC", "user-generated", "creator video", "TikTok", "Reels", "Shorts", "POV", "selfie video", "phone-shot", "vlogger", "talking head" (when context implies social media), "for social", "Instagram video", "YouTube short".
+**Phrases that OVERRIDE UGC defaults** (use them as-given, not as UGC): "commercial", "ad spot" (without UGC), "cinematic", "broadcast", "TV ad", "horizontal", "16:9", "landscape", "billboard". When the user uses one of these, switch to `product_showcase` or `tv_spot` mode.
+## Hooks & Settings (concept)
+Hooks and settings are **reusable opening angles / scene contexts** that get prepended to the user's prompt. Kolbo does not yet expose these as first-class MCP primitives, but the concept is portable:
+- **Hook** = the opening line / angle of the ad (the first 1–2 seconds that earn the scroll). Example hooks: "POV: you just discovered X", "Why I stopped buying Y", "3 reasons this X is worth it", "Watch this before you buy a Y".
+- **Setting** = the scene/environment context. Example settings: "in a bright minimalist kitchen", "walking in a busy city street", "on a yoga mat at golden hour".
+**When the user asks for an ad and doesn't specify the opening**, offer 2–3 hook options (one-liner each) in a labeled-question style — never freeform "what hook?" Same for setting if the brief is location-agnostic.
+**Whitelist rule:** hooks/settings only make sense for `ugc`, `ugc_how_to`, `ugc_unboxing`, `product_review`, `ugc_virtual_try_on`. For `product_showcase`, `tv_spot`, `wild_card`, `virtual_try_on` — skip hooks/settings; those modes are concept-driven not hook-driven.
+**Mutually exclusive with ad references** (next section). Pick one path per generation.
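The whitelist and the mutual-exclusion rule can be sketched like this. Since hooks and settings are not first-class MCP primitives, everything here is an assumption: the helper name `compose_prompt`, joining the parts with spaces, and the order hook, setting, user prompt.

```python
HOOK_MODES = {"ugc", "ugc_how_to", "ugc_unboxing",
              "product_review", "ugc_virtual_try_on"}


def compose_prompt(mode, user_prompt, hook=None, setting=None,
                   reference_videos=()):
    """Prepend hook/setting for whitelisted modes only."""
    if reference_videos and (hook or setting):
        # Reference-driven and composed-from-blocks paths must not be mixed.
        raise ValueError("use either ad references or hook/setting, not both")
    if mode not in HOOK_MODES:
        # Polished modes ignore hooks and settings.
        return user_prompt
    return " ".join(part for part in (hook, setting, user_prompt) if part)


text = compose_prompt(
    "ugc",
    "She tries the serum.",
    hook="POV: you just discovered X.",
    setting="In a bright minimalist kitchen.",
)
```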
+## Ad References (modeling new ads after existing ones)
+Sometimes the user has a reference ad they want to model the new ad after — their own previous winning ad, a competitor's ad, or a viral video. Kolbo path:
+1. **Upload the reference video** via `upload_media` (returns CDN URL).
+2. **Pass it as `reference_videos`** to `generate_elements`, OR as `source_video` to `generate_video_from_video` (if you want to actually restyle / re-shoot the reference).
+3. **Describe in the prompt** what to preserve from the reference (`@video1`'s pacing / camera move / lighting / cut rhythm) and what to change (subject / product / setting).
+4. **Tag with `@video1`** per `workflows/visual-dna.md` reference-tagging rules.
+**Mutually exclusive with hooks/settings** — pick one composition path per generation. Either reference-driven (use `@video1`) or composed-from-blocks (hook + setting + product). Mixing produces muddled output.
+## Avatars (= Visual DNA characters)
+What other platforms call "preset avatars" or "custom avatars" Kolbo calls **Visual DNA characters**. Two ways to get one:
+- **Existing character** — use `list_visual_dnas` to find one the user has already created.
+- **New character** — create with `create_visual_dna({ type: "character", name, images: [...] })`. See `workflows/visual-dna.md` for the full creation flow (pre-flight, naming rule, generate-reference-images-first).
+**For UGC modes:** an avatar is optional if the brief clearly mentions a person (the model can synthesize one). Pass `visual_dna_ids` when the user wants a *specific* presenter — their face, the brand founder, a previously trained character.
+**Always use `@<dna-name>` in the prompt** when passing `visual_dna_ids` — see `workflows/visual-dna.md` `@name` rules.
+## Products (image upload + reference)
+For ads that feature a specific product:
+1. **Upload product photo** via `upload_media` → Kolbo CDN URL.
+2. **Pass as `reference_images`** to `generate_creative_director` / `generate_elements` / `generate_video_from_image`.
+3. **Tag with `@image1`** in the prompt.
+4. **Log in `.kolbo/production.md`** under a `### Products` subsection so future ads in the same workspace reuse the same CDN URL (don't re-upload).
+If the user gives a **product URL** instead of a photo, see `workflows/research-first.md` — scrape, extract images, re-host via `upload_media`, persist as a brand kit at `.kolbo/brand-kits/<slug>.md`.
+## UX Rules
+1. **Always pick a mode explicitly.** Don't auto-pick from one ambiguous word. If the user said "make me an ad" with no other signal, offer labeled options: `[UGC / TV Spot / Product Showcase / Surprise me]`.
+2. **Always confirm aspect ratio + duration + sound** before firing — these materially change output and cost. One question, labeled options.
+3. **Default UGC settings are hard rules** — captions OFF, music OFF, watermarks OFF — even when the user doesn't mention them. Only flip when they ask.
+4. **No auto-retry on failure.** If the generation fails (content policy, model OOM), surface the reason and let the user adjust prompt or product.
+5. **Show results without dumping URLs** — see SKILL.md "Generated URLs in chat".
+## Prompt Template Seed for UGC
+```
+UGC selfie video, vertical 9:16, handheld phone aesthetic.
+{presenter description or @<dna-name>} in {everyday setting},
+{energy level: relaxed | enthusiastic | curious | reactive}.
+They {natural action with the product/subject},
+talking directly to camera.
+Phone-shot lighting (window/screen key light),
+slight handheld sway, no cinematic moves.
+Style: authentic creator content, NOT polished commercial.
+Sound: ambient room tone only, no music, no SFX overlay.
+```
+## Prompt Template Seed for TV Spot
+```
+3-shot broadcast commercial, cinematic 16:9.
+Shot 1 [0–5s] — {establishing hook}: {wide angle subject + camera move}, {lighting}, {tone setter}.
+Shot 2 [5–15s] — {product reveal / demo}: {medium shot with product in focus}, {practical action}, {emotional beat}.
+Shot 3 [15–25s] — {payoff + CTA}: {close-up or pull-back}, {brand line in dialogue or SFX}, {final hold}.
+Style: {brand mood — e.g., warm + premium / clean + modern / bold + youthful}.
+Audio: full mix — dialogue + score + SFX. Music: {genre/tempo}.
+```
+(Run via `generate_creative_director` with `workflow_type: "video"`, `scene_count: 3`.)

package/skills/kolbo/references/workflows/marketplace-cards.md ADDED

@@ -0,0 +1,146 @@
+# Marketplace Cards — Amazon / Shopify / eBay Listings
+Load this file when the user wants **marketplace listing visuals** — main image, secondary product images, infographics, or A+ content modules for Amazon, Shopify, eBay, Etsy, Walmart, or similar.
+For generic brand product photography (Pinterest, hero banner, lifestyle, ad pack) see `workflows/product-photoshoot.md`. For ad video see `workflows/marketing-studio.md`. For composed DTC ads see `workflows/dtc-ads.md`.
+## What This Is
+Marketplace listings need a **specific, compliance-aware visual system** that's different from brand campaign imagery:
+- **Main image** — strict marketplace rules (typically pure white background, product fills 85% of frame, no text, no props, no shadows). This is the conversion-critical thumbnail.
+- **Secondary product images** — multi-angle, detail shots, lifestyle, "what's in the box". Show the product from every angle a shopper needs before clicking buy.
+- **A+ content / Enhanced Brand Content (Amazon)** — long-form modules below the fold: hero banner, pain-point grid, feature comparison, ingredients breakdown, efficacy proof, how-to-use steps, brand endorsement / founder story.
+## The 4 Bundle Scopes
+When the user asks for a common bundle, fire one call per scope:
+| Scope | Creates |
+|---|---|
+| `main` | 1 marketplace main image |
+| `product-images` | main image + 5 secondary images |
+| `aplus` | main image + 7 A+ content modules |
+| `full-set` | main image + 5 secondary + 7 A+ modules (13 assets total) |
+Use a **custom subset** of the asset list below when the user wants a non-standard combination (e.g. "just main + infographic + lifestyle").
+## The 13 Asset Types
+| Asset | Purpose | Aspect ratio | Model preference |
+|---|---|---|---|
+| `main_image` | Marketplace thumbnail — strict compliance: pure white bg, product fills 85% of frame, no text, no props | `1:1` | Nano Banana 2 (clean studio render) |
+| `infographic` | Feature callouts with text labels and product hero | `1:1` or `4:5` | **GPT Image 2** (dense on-image text) |
+| `multi_angle` | 4-up grid showing front / back / sides of product | `1:1` | Nano Banana 2 |
+| `detail_shot` | Macro shot of texture / material / mechanism | `1:1` | Nano Banana 2 |
+| `lifestyle` | Product in use in real environment | `1:1` or `4:5` | Nano Banana 2 |
+| `whats_in_box` | Flat-lay showing the product + accessories laid out neatly | `1:1` | Nano Banana 2 |
+| `aplus_hero_banner` | Wide A+ header — brand identity hit | `3:1` | GPT Image 2 |
+| `aplus_pain_points` | 3-up grid showing the problem this product solves | `16:9` | GPT Image 2 (text) |
+| `aplus_features` | 3-up or 4-up feature breakdown with labels | `16:9` | GPT Image 2 |
+| `aplus_ingredients` | Ingredients / materials breakdown (skincare, food, supplements) | `16:9` | GPT Image 2 |
+| `aplus_efficacy` | Before/after, % stats, clinical results — proof block | `16:9` | GPT Image 2 (charts + text) |
+| `aplus_how_to_use` | Numbered step-by-step usage instructions | `16:9` | GPT Image 2 |
+| `aplus_endorsement` | Founder story, brand mission, testimonial-style | `16:9` | Nano Banana 2 (people) + GPT Image 2 (text overlay) |
+## Kolbo MCP Routing
+For **bundles** (`product-images`, `aplus`, `full-set`): use `generate_creative_director` with `scene_count` = number of assets in the bundle. Pass the product image as `reference_images[0]` so it appears consistently across every asset. Each scene's prompt encodes one asset type.
+For **single `main` image** or **custom subset** of ≤ 2 assets: `generate_image` per asset, fired in parallel (single response, multiple tool calls).
+For **multi-angle** specifically: this is one image with a 4-up grid composition, so use `generate_image` with a prompt describing the 2×2 layout, NOT `generate_creative_director`. Alternatively, fire 4 separate `generate_image` calls and composite the grid yourself, depending on user preference.
+**Always pass the product photo** as `reference_images` for every call. `@image1` references it in the prompt. If the user gave a URL instead of a photo, run `workflows/research-first.md` first.
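The routing rules above could be sketched roughly as follows, in Python. The tool names, `scene_count`, and the per-scope asset counts come from this file; the `route` helper, its return shape, and the handling of custom subsets larger than two assets are assumptions, since the text does not say which tool those should use.

```python
from typing import List, Optional

# Asset counts per bundle scope, taken from the scope table above
# (main image + 5 secondary, main image + 7 A+ modules, and all 13).
BUNDLE_ASSET_COUNTS = {"product-images": 6, "aplus": 8, "full-set": 13}


def route(scope: str, custom_assets: Optional[List[str]] = None) -> dict:
    """Return a plain dict describing which tool to call (hypothetical shape)."""
    if scope in BUNDLE_ASSET_COUNTS:
        return {
            "tool": "generate_creative_director",
            "scene_count": BUNDLE_ASSET_COUNTS[scope],
        }
    if scope == "main":
        return {"tool": "generate_image", "assets": ["main_image"]}
    if scope == "custom":
        assets = list(custom_assets or [])
        if not assets:
            raise ValueError("custom scope needs at least one asset")
        if len(assets) <= 2:
            # One generate_image call per asset, fired in parallel.
            return {"tool": "generate_image", "assets": assets}
        # ASSUMPTION: the source does not cover custom subsets of 3+ assets;
        # this sketch treats them like a bundle.
        return {"tool": "generate_creative_director", "scene_count": len(assets)}
    raise ValueError(f"unknown scope: {scope!r}")
```

A standalone `multi_angle` request falls into the one-asset custom case here, which matches the rule that the 2×2 grid is a single `generate_image` call.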
+## Main Image Compliance Rules (HARD)
+Different marketplaces have different rules. The **strictest is Amazon's**, which most other marketplaces follow:
+1. **Pure white background** (`#FFFFFF`, no gradients, no shadow tone).
+2. **Product fills ≥ 85% of the frame** — minimal margin.
+3. **NO text** — no logos baked in, no callouts, no "NEW" stickers, no watermarks.
+4. **NO props** — just the product. No hands, no models, no styling pieces.
+5. **NO multiple products** — single hero (variant grids go in secondary, not main).
+6. **NO color borders / decorative frames**.
+Bake these into every `main_image` prompt as explicit prohibitions:
+```
+Pure white background (#FFFFFF), seamless studio sweep.
+Product (@image1) centered, fills at least 85% of frame.
+Tack-sharp focus, no shadows on background, soft contact shadow only.
+NO text, NO logos, NO captions, NO props, NO models, NO decorative borders.
+Photographic, catalog-grade, neutral color.
+```
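One way to keep those prohibitions from drifting is to build the prompt from a single list and check prompts against it before firing. This is a sketch only: `build_main_image_prompt` and `missing_prohibitions` are hypothetical helpers, and the check is a plain substring match, not a real compliance validator. The fill ratio is worded as a minimum to match rule 2.

```python
# Prohibitions taken from the main-image prompt in this section.
MAIN_IMAGE_PROHIBITIONS = [
    "text", "logos", "captions", "props", "models", "decorative borders",
]


def build_main_image_prompt(product_tag: str = "@image1",
                            min_fill_percent: int = 85) -> str:
    """Assemble the main_image prompt with every prohibition spelled out."""
    prohibitions = ", ".join(f"NO {item}" for item in MAIN_IMAGE_PROHIBITIONS)
    lines = [
        "Pure white background (#FFFFFF), seamless studio sweep.",
        f"Product ({product_tag}) centered, fills at least {min_fill_percent}% of frame.",
        "Tack-sharp focus, no shadows on background, soft contact shadow only.",
        prohibitions + ".",
        "Photographic, catalog-grade, neutral color.",
    ]
    return "\n".join(lines)


def missing_prohibitions(prompt: str) -> list:
    """List the prohibitions a prompt fails to state explicitly."""
    return [item for item in MAIN_IMAGE_PROHIBITIONS
            if f"NO {item}" not in prompt]
```

Running `missing_prohibitions` on a hand-edited prompt before generation is a cheap guard against shipping a main image prompt that quietly dropped one of the rules.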
+## Secondary Image Strategy
+Standard 5-image set (used for the `product-images` scope):
+1. **Lifestyle** — product in use, real environment
+2. **Detail / texture** — macro of the key material or feature
+3. **Multi-angle** — 4-up showing all sides
+4. **What's-in-the-box** — flat-lay of components
+5. **Scale / size reference** — product next to a hand or known-size object
+Adjust based on category: skincare needs ingredients close-up + texture-on-skin; apparel needs front + back + on-model + detail + size chart; electronics needs ports/buttons close-up + size comparison.
+## A+ Content Strategy
+A+ modules tell a story below the fold. Standard 7-module flow:
+1. `aplus_hero_banner` — brand identity / aspirational hero
+2. `aplus_pain_points` — what problem we solve
+3. `aplus_features` — how we solve it (3–4 differentiators)
+4. `aplus_ingredients` (skincare/food/supplements) OR materials/specs (electronics/apparel)
+5. `aplus_efficacy` — proof (before/after, % stats, third-party data)
+6. `aplus_how_to_use` — usage steps
+7. `aplus_endorsement` — founder story / mission / testimonial
+For dense-text modules (`aplus_features`, `aplus_pain_points`, `aplus_efficacy`, `aplus_how_to_use`): always recommend **GPT Image 2** at `resolution: "2K"` or `"4K"` (text needs the higher tier to stay sharp).
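The 7-module flow and the dense-text recommendation can be written down as data, sketched here in Python. The module names, their order, and the `"2K"` / `"4K"` values come from this section; `plan_aplus_modules` and the dict layout are hypothetical, and `"GPT Image 2"` is used as a display name, not a confirmed model identifier.

```python
# Standard 7-module order from the list above. Slot 4 switches to
# materials/specs for electronics or apparel; the source gives no separate
# asset name for that variant, so this sketch keeps `aplus_ingredients`.
APLUS_FLOW = [
    "aplus_hero_banner",
    "aplus_pain_points",
    "aplus_features",
    "aplus_ingredients",
    "aplus_efficacy",
    "aplus_how_to_use",
    "aplus_endorsement",
]

# Modules the text singles out as dense-text.
DENSE_TEXT_MODULES = {
    "aplus_features", "aplus_pain_points", "aplus_efficacy", "aplus_how_to_use",
}


def plan_aplus_modules(resolution: str = "2K") -> list:
    """Return the ordered module plan, tagging dense-text modules."""
    if resolution not in ("2K", "4K"):
        raise ValueError("dense-text modules need resolution '2K' or '4K'")
    plan = []
    for position, asset in enumerate(APLUS_FLOW, start=1):
        entry = {"position": position, "asset": asset}
        if asset in DENSE_TEXT_MODULES:
            entry["model"] = "GPT Image 2"
            entry["resolution"] = resolution
        plan.append(entry)
    return plan
```

Modules outside the dense-text set are left without a model or resolution here, so the asset-type table remains the place to look up their preference.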
+## Brand Kit Reuse
+Check `.kolbo/brand-kits/<slug>.md` before generating. Pull `primary_color`, `accent_color`, `text_color`, `bg_color`, `fonts`. Bake into every A+ module prompt — marketplace pages live or die on visual consistency across the 13 assets.
+## Pre-Generation Interview
+Ask 2–3 short labeled questions before firing:
+1. **Which marketplace?** `[Amazon US / Amazon EU / Shopify / Etsy / eBay / Walmart / Other]` — affects compliance rules
+2. **Which bundle?** `[main / product-images / aplus / full-set / custom subset]`
+3. **Brand kit?** Auto-detect from `.kolbo/brand-kits/<slug>.md`; otherwise ask if brand colors / fonts should be applied
+Skip questions whose answer is obvious from the request.
+## Output Discipline
+- For bundles: `generate_creative_director` returns N URLs. Present them in chat as a numbered list, one URL per line, with the asset name as label:
+  ```
+  Marketplace cards ready:
+  1. Main image: https://...
+  2. Lifestyle: https://...
+  3. Detail shot: https://...
+  ...
+  ```
+  Do NOT wrap them in an HTML grid artifact — the canvas already shows the gallery.
+- Log all URLs to `.kolbo/production.md` under `## Production: <product name>` → `### Marketplace Cards` subsection.
+- If a `main_image` came back with text / props (compliance failure), surface the issue and re-fire with stronger prompt prohibitions — don't ship a non-compliant main image.
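The numbered-list format above is simple enough to generate mechanically. A minimal Python sketch, assuming the results arrive as `(label, url)` pairs; `format_results` and that input shape are illustrative, not part of any Kolbo API.

```python
def format_results(assets: list) -> str:
    """Render (label, url) pairs as the numbered list shown above."""
    lines = ["Marketplace cards ready:"]
    for number, (label, url) in enumerate(assets, start=1):
        # One asset per line: "<n>. <label>: <url>"
        lines.append(f"{number}. {label}: {url}")
    return "\n".join(lines)
```

The same string can be appended to `.kolbo/production.md` under the product's `### Marketplace Cards` subsection, so the chat output and the log stay in step.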
+## Existing Main-Image Reuse
+If the user already has an approved `main_image` from a prior session and wants to generate only secondary / A+ assets that match it:
+1. Look up the main image URL from `.kolbo/production.md`.
+2. Pass it as `reference_images[1]`, after the product photo at `reference_images[0]`, so the new assets match the main's exact rendering style: same lighting, same color cast, same product orientation.
+3. Tag it as `@image2` in the prompt: "Match the product rendering from `@image2` exactly — same angle, same lighting, same color cast."
+## UX Rules
+1. **Always ask which marketplace** — compliance rules vary.
+2. **Strict compliance prompts on main_image** — explicit NO text / NO props / NO models / NO borders.
+3. **Always reuse brand kit** — Read `.kolbo/brand-kits/<slug>.md` first; pass palette + fonts to every A+ module.
+4. **Recommend GPT Image 2 + 2K/4K for dense-text A+ modules.** Nano Banana 2 renders text well, but GPT Image 2 wins at multi-line technical layouts.
+5. **For bundles, always use `generate_creative_director`** — never fire 13 parallel `generate_image` calls.
+6. **Log everything to `.kolbo/production.md`** — marketplace listings get updated quarterly; reuse beats regenerate.