@scenerok/cli 1.0.10 → 1.0.12
package/package.json CHANGED
package/skills/shared/SKILL.md CHANGED
@@ -80,29 +80,54 @@ output to "video.mp4", resolution: "1080x1920", fps: 30
 ### Package Imports
 
 ```vidscript
-import xai from "@scenerok/xai"
-import
-import
+import xai from "@scenerok/xai" # xAI image/video/tts
+import cf from "@scenerok/cloudflare" # Cloudflare AI Gateway image/video models
+import eleven from "@elevenlabs/music" # ElevenLabs generative music
+import motion from "@scenerok/basic-animations" # text/video animation helpers
 ```
 
 ### Plugin Calls
 
 ```vidscript
 import xai from "@scenerok/xai"
+import cf from "@scenerok/cloudflare"
 import eleven from "@elevenlabs/music"
 import motion from "@scenerok/basic-animations"
 
-[-] = video xai.
+[- 4s] = video xai.image("Premium product still on seamless studio background, no text, no words, no letters, no captions, no logos, no watermark, no readable UI copy", aspect_ratio: "9:16", resolution: "2k")
+[-] = video xai.imagine("Cinematic product video, premium lighting, no text, no words, no letters, no captions, no logos, no watermark, no readable UI copy", aspect_ratio: "9:16", duration: 5)
+[-] = video xai.imageToVideo("https://cdn.example.com/product.png", "Slow premium camera move around the product, no text, no words, no letters, no captions, no logos, no watermark, no readable UI copy", aspect_ratio: "9:16", duration: 6)
+[-] = video xai.referenceToVideo(["https://cdn.example.com/product.png", "https://cdn.example.com/person.png"], "Lifestyle ad shot featuring the referenced product and person, no text, no words, no letters, no captions, no logos, no watermark, no readable UI copy", aspect_ratio: "9:16", duration: 8)
+[-] = video xai.extendVideo("https://cdn.example.com/clip.mp4", "Camera pulls back to reveal the full premium setup, no text, no words, no letters, no captions, no logos, no watermark, no readable UI copy", duration: 5)
 [-] = audio xai.tts("Welcome to SceneRok", voice: "eve")
 
+# Cloudflare gives access to multiple image/video providers through one package.
+[-] = video cf.video("High-energy fashion product shot, handheld camera feel, no text, no words, no letters, no captions, no logos, no watermark, no readable UI copy", model: "pixverse/v6", aspect_ratio: "9:16", duration: 5, generate_audio: false)
+[-] = video cf.imageToVideo("https://cdn.example.com/product.png", "Animate the product with realistic parallax and soft studio lighting, no text, no words, no letters, no captions, no logos, no watermark, no readable UI copy", model: "runwayml/gen-4.5", aspect_ratio: "9:16", duration: 5)
+[- 4s] = video cf.image("Clean editorial product still, no text, no words, no letters, no captions, no logos, no watermark, no readable UI copy", model: "black-forest-labs/flux-2-pro-preview", aspect_ratio: "9:16")
+
 # ElevenLabs music package functions require an import alias.
 let bed = eleven.music("Warm premium launch bed", duration: 15, instrumental: true)
 [0s .. 15s] = audio bed, volume: 0.35, fade_out: 2s
 ```
 
-**Always** import the package that owns the function: `xai.*` from `@scenerok/xai`, `eleven.*` from `@elevenlabs/music`, and `motion.*` from `@scenerok/basic-animations`. Calls without import fail validation with `Unknown function 'xai.imagine'`.
+**Always** import the package that owns the function: `xai.*` from `@scenerok/xai`, `cf.*` from `@scenerok/cloudflare`, `eleven.*` from `@elevenlabs/music`, and `motion.*` from `@scenerok/basic-animations`. Calls without import fail validation with `Unknown function 'xai.imagine'`.
+
+Use the full visual toolkit, not only `xai.imagine()`. Generate a still first with `xai.image(...)`, `xai.generateImage(...)`, `cf.image(...)`, or `cf.generateImage(...)` when a clean key visual is enough. Use `xai.imageToVideo(image, prompt, ...)` or `cf.imageToVideo(image, prompt, ...)` when you have one strong product/screenshot/logo image to animate. Use `xai.referenceToVideo([images...], prompt, ...)` or `cf.referenceToVideo([images...], prompt, ...)` when extracted objects, screenshots, product photos, people, packaging, or brand elements should guide the generated scene. Use `xai.extendVideo(...)` or `cf.extendVideo(...)` when an existing generated/user clip should continue.
+
+Available plugin APIs:
+
+| Package | Functions | Typical use |
+|---------|-----------|-------------|
+| `@scenerok/xai` | `image`, `generateImage`, `imagine`, `genVideo`, `imageToVideo`, `referenceToVideo`, `extendVideo` | Grok stills, text-to-video, image/reference-guided video, clip extension |
+| `@scenerok/xai` | `tts`, `textToSpeech`, `listVoices` | Spoken narration and voice lookup |
+| `@scenerok/cloudflare` | `image`, `generateImage`, `imagine`, `genVideo`, `generateVideo`, `video`, `imageToVideo`, `referenceToVideo`, `extendVideo`, `listModels`, `models` | Alternate providers/models through Cloudflare AI Gateway, including PixVerse, Vidu, Runway, MiniMax, Seedance, Veo, FLUX, GPT Image, Recraft |
+| `@elevenlabs/music` | `music`, `generateMusic`, `composeMusic` | Background music beds, theme music, instrumental loops |
+| `@scenerok/basic-animations` | `fadeIn`, `fadeOut`, `slideX`, `slideY`, `popIn`, `riseIn`, `swingIn`, `glitchIn`, `float`, `typewriter` | Motion descriptors for `animate:` |
+
+xAI TTS voice IDs: `eve` for demos/announcements/upbeat content, `ara` for warm conversational narration, `rex` for business/tutorial delivery, `sal` for balanced general narration, and `leo` for authoritative instructional narration.
 
-For ElevenLabs music, import `@elevenlabs/music` and call `eleven.music(...)`, `eleven.generateMusic(...)`, or `eleven.composeMusic(...)`. Use `let bed = eleven.music(...)` followed by `audio bed, volume: ...` when you need volume or fades.
+For ElevenLabs music, import `@elevenlabs/music` and call `eleven.music(...)`, `eleven.generateMusic(...)`, or `eleven.composeMusic(...)`. Use `let bed = eleven.music(...)` followed by `audio bed, volume: ...` when you need volume or fades. Do not call `eleven.tts(...)`; this repo currently exposes voiceover through `xai.tts(...)` / `xai.textToSpeech(...)`. Place voiceover as an `audio` block and do not leave ads silent unless the user asks for silent output.
 
 Read `vidscript-strict.md` for anti-patterns and `examples/system/*.vid` for copy-paste templates (installed with `scenerok skills install`).
 
@@ -120,11 +145,12 @@ input hero = "{{hero_clip}}"
 2. **Plan the structure** — Time blocks, durations, inputs
 3. **Gather assets** — Video URLs, local paths, generated clips, and selective website screenshots/images; when the user provides a URL, browser screenshots are optional source material, not a requirement to build the whole video from screenshots
 4. **Probe asset type and dimensions** — Inspect every local or downloaded asset before placing it. Use `file` to identify image vs video, `ffprobe` for video/audio, `sips -g pixelWidth -g pixelHeight file` on macOS images, or `identify file` when ImageMagick is available.
-5. **
-6. **
-7. **
-8. **
-9. **
+5. **Use images directly when useful** — VidScript `input` supports both videos and images. You can place a still image input with `video image_input`; the renderer detects the asset type and treats it as a static visual clip for the block duration.
+6. **Choose plugin modality deliberately** — still image, text-to-video, image-to-video, reference-to-video, video extension, TTS, and music are separate tools; use the one that matches the assets and story beat
+7. **Compose VidScript** — Write the full script using measured dimensions, aspect-ratio-preserving scale, centered/safe-area placement, entry/exit animation beats, and audio blocks when the video needs narration or a music bed
+8. **Validate** — Run `scenerok validate script.vid`
+9. **Render** — Run `scenerok render script.vid --watch`
+10. **Deliver** — Share the download URL when complete
 
 ## Website URL Asset Workflow
 
@@ -135,11 +161,11 @@ When the user provides a product, company, landing page, app store listing, or e
 3. Crop or frame screenshots to the specific visual evidence needed. Do not use screenshots as the only visual style by default, and do not build a whole video as a sequence of unreadable full-page captures.
 4. Detect and save usable product images, app screenshots, brand marks, and logos from the page when available. Prefer direct image assets when they are accessible and clearly match the product; use browser screenshots as supporting assets when they communicate something visible at video size.
 5. Probe saved asset type and dimensions before composing. Record file type, width, height, aspect ratio, and duration if video; never guess dimensions from filenames or screenshots.
-6.
-7. Use captured assets
+6. Use saved still images directly as timeline inputs when the real asset should appear on screen, or feed the strongest stills into `xai.imageToVideo` / `xai.referenceToVideo` when motion would make the output stronger. Keep screenshots focused and readable.
+7. Use captured assets as grounded proof points alongside other visual material. A good ad can combine product/logo/app screenshots, generated video clips, motion backgrounds, text primitives, and music.
 8. Scale assets from their measured aspect ratio to fit the output frame and safe areas. For 1080x1920 ads, keep primary visuals inside roughly 80-88% of frame width and reserve enough top/bottom room for text.
-9. Animate visual assets intentionally: use entry animations such as `motion.popIn`, `motion.riseIn`, or `motion.slideY`; use short exit fade/slide segments when a clean transition is needed.
-10. Generative visuals
+9. Animate visual assets intentionally: use entry animations such as `motion.popIn`, `motion.riseIn`, or `motion.slideY`; use short exit fade/slide segments when a clean transition is needed.
+10. Generative visuals are encouraged when useful. Choose among still generation (`xai.image`, `xai.generateImage`, `cf.image`, `cf.generateImage`), text-to-video (`xai.imagine`, `xai.genVideo`, `cf.video`, `cf.generateVideo`), image-to-video (`xai.imageToVideo`, `cf.imageToVideo`), reference-to-video (`xai.referenceToVideo`, `cf.referenceToVideo`), and clip extension (`xai.extendVideo`, `cf.extendVideo`).
 11. For every generated visual prompt, explicitly require a clean scene with no text, no words, no letters, no captions, no logos, no watermarks, and no readable UI copy. AI video generation often corrupts text; all final titles, offers, captions, CTAs, prices, and labels must be created with VidScript `text` primitives.
 12. Do not invent product claims. Extract copy, pricing, feature names, social proof, and CTA language from the site or ask the user.
 
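Editorial sketch (not part of the package diff): step 8 of the workflow above asks for scaling from the measured aspect ratio and keeping primary visuals within roughly 80-88% of frame width. The arithmetic might look like the following Python helper. The function name `fit_centered`, the 0.85 default, and the reading of `x`/`y` as top-left pixel offsets are all assumptions made for illustration, and the sketch does not reserve extra top/bottom room for text.

```python
def fit_centered(asset_w, asset_h, frame_w=1080, frame_h=1920, max_width_ratio=0.85):
    """Return width/height/x/y that fit an asset into the frame, centered.

    Hypothetical helper: the aspect ratio is preserved, and x/y are assumed
    to be top-left pixel offsets.
    """
    if asset_w <= 0 or asset_h <= 0:
        raise ValueError("asset dimensions must be positive")
    # Scale so the asset fills at most max_width_ratio of the frame width...
    scale = (frame_w * max_width_ratio) / asset_w
    # ...but never let a tall asset overflow the frame height.
    scale = min(scale, frame_h / asset_h)
    width = round(asset_w * scale)
    height = round(asset_h * scale)
    return {
        "width": width,
        "height": height,
        "x": (frame_w - width) // 2,
        "y": (frame_h - height) // 2,
    }


# A 1440x900 screenshot measured with ffprobe/sips/identify:
placement = fit_centered(1440, 900)
print(placement)  # {'width': 918, 'height': 574, 'x': 81, 'y': 673}
```

The resulting numbers would then be copied into explicit `width`, `height`, `x`, and `y` parameters on the `video` block, assuming the renderer interprets them as pixels.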
@@ -151,7 +177,7 @@ When the user provides a product, company, landing page, app store listing, or e
 - **Use website assets judiciously for URL-based ads** — browser screenshots, product images, app screenshots, and logos can ground the ad, but they are optional ingredients, not the whole recipe
 - **Avoid full-page screenshot backgrounds** — use above-fold, cropped, or focused screenshots that remain legible at video size; never rely on long website screenshots where the text becomes unreadable
 - **Probe before placing** — get exact dimensions for every real asset, calculate scale from aspect ratio, and set `width`, `height`, `x`, and `y` explicitly
-- **Match asset type to placement** — use real
+- **Match asset type to placement** — use real videos for motion footage and image inputs for still product shots, screenshots, logos, and brand marks. The `video` primitive can place either media type.
 - **Animate assets, not only text** — give product screenshots/logos/cards a clear entrance and a clean exit; split static visuals into main and fade-out segments if needed
 - **Keep generated media text-free** — prompts for AI-generated images/videos must explicitly say no text, no words, no letters, no captions, no logos, no watermarks, and no readable UI copy; add all final copy with VidScript `text` primitives
 - Use 1080x1920 for vertical content (TikTok/Instagram)
@@ -159,6 +185,7 @@ When the user provides a product, company, landing page, app store listing, or e
 - Hook viewers in the first 3 seconds
 - Use high-contrast text on video backgrounds with `stroke` and `stroke_width`
 - Include a clear call-to-action
+- Treat filters as optional finishing tools. Use `vignette`, `saturation`, `contrast`, etc. only when they improve the composition; do not add them by default.
 - Test with `scenerok validate` before rendering
 - Each render costs 1 credit
 
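Editorial sketch (not part of the package diff): the "keep generated media text-free" tip above means every generated-media prompt carries the same exclusion list. One way to avoid retyping it, assuming prompts are assembled by a script before being pasted into VidScript, is a small Python helper. `text_free_prompt` and `NO_TEXT_SUFFIX` are hypothetical names; the suffix wording is copied from the example prompts in this diff, and the double-quote check reflects the "nested quotes in prompts" pattern that the docs list as invalid.

```python
# Exclusion list as it appears in the example prompts in this diff.
NO_TEXT_SUFFIX = (
    "no text, no words, no letters, no captions, no logos, "
    "no watermark, no readable UI copy"
)


def text_free_prompt(description: str) -> str:
    """Append the text-free exclusion list to a visual prompt (hypothetical helper)."""
    if '"' in description:
        # The prompt is later wrapped in double quotes inside the script.
        raise ValueError("avoid nested double quotes inside a prompt")
    base = description.strip().rstrip(",. ")
    if base.endswith(NO_TEXT_SUFFIX):
        return base  # already compliant, do not append twice
    return f"{base}, {NO_TEXT_SUFFIX}"


print(text_free_prompt("Cinematic product video, premium lighting."))
```

All final on-screen copy would still be added with VidScript `text` primitives, as the tips require.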
@@ -184,9 +211,9 @@ If a render fails:
 
 ## Sample Video Types
 
-- **Product Promo** — Hero clip + headline + description + CTA
+- **Product Promo** — Hero clip or generated visual + headline + description + CTA
 - **Social Media Hook** — Fast cuts, bold text, speed adjustments
-- **Testimonial** — Speaker clip + quote text +
+- **Testimonial** — Speaker clip + quote text + optional styling/filter pass
 - **Meme Remix** — Reaction clip + top/bottom punchline overlays
 - **Title Sequence** — Background clip + glitch shader + bold typography
 
@@ -4,22 +4,24 @@ VidScript is a declarative DSL for composing short-form videos. Write a script,
 
 ## Agent quick start (read before composing)
 
-1. **Import every package function** — use `import xai from "@scenerok/xai"`, `import eleven from "@elevenlabs/music"`, and `import motion from "@scenerok/basic-animations"` before calling `xai.*`, `eleven.*`, or `motion.*`.
+1. **Import every package function** — use `import xai from "@scenerok/xai"`, `import cf from "@scenerok/cloudflare"`, `import eleven from "@elevenlabs/music"`, and `import motion from "@scenerok/basic-animations"` before calling `xai.*`, `cf.*`, `eleven.*`, or `motion.*`.
 2. **Validate early** — `scenerok validate file.vid` (errors show as `Line N:C — message`).
 3. **Music package syntax** — `import eleven from "@elevenlabs/music"`, then call `eleven.music(...)`, `eleven.generateMusic(...)`, or `eleven.composeMusic(...)`.
 4. **Copy working examples** — see `examples/system/` in this skill folder (e.g. `minimal-text-reel.vid`, `ecom-product-launch.vid`, `rokmilk-chocolate-promo.vid`).
 5. **Use URL assets selectively** — if the user gives a website or product URL and browser screenshot tools are available, you can capture above-fold screenshots, product images, app screenshots, and logos for VidScript `input` assets. Do not treat screenshots as mandatory or as the only visual source.
 6. **Probe asset type and dimensions** — before placing real assets, identify image vs video with `file`; inspect width, height, and duration with `ffprobe`, `sips`, or `identify`; then scale from the measured aspect ratio and set explicit `width`, `height`, `x`, and `y`.
-7. **Use still images
-8. **
-9. **
-10. **
+7. **Use still images directly** — `input` supports videos and images. Place still screenshots, logos, and product photos with `video image_input`; the renderer treats images as static clips for the block duration.
+8. **Use the full media plugin toolkit** — choose still generation, text-to-video, image-to-video, reference-to-video, video extension, TTS, and music deliberately. Do not default to only `xai.imagine`.
+9. **Include audio when useful** — use `xai.tts` / `xai.textToSpeech` for narration and `eleven.music` / `eleven.generateMusic` / `eleven.composeMusic` for music beds.
+10. **Generated media has no text** — prompts for AI-generated images/videos must explicitly ask for no text, no words, no letters, no captions, no logos, no watermarks, and no readable UI copy. Use VidScript `text` primitives for all final copy.
+11. **Strict rules** — see `vidscript-strict.md` for invalid patterns (bare `xai.imagine`, bare `cf.video(...)`, bare `eleven.music(...)` without import, trailing params after direct plugin calls, nested quotes in prompts).
 
 ### Common validation failures
 
 | Symptom | Fix |
 |---------|-----|
 | `Unknown function 'xai.imagine'` | Add `import xai from "@scenerok/xai"` at top |
+| `Unknown function 'cf.video'` | Add `import cf from "@scenerok/cloudflare"` at top |
 | `Unknown function 'fadeIn'` | Use `import motion from "@scenerok/basic-animations"` and call `motion.fadeIn(...)` |
 | `Expected ... but "," found` | Do not put `, volume:` after a direct plugin call. Use `let bed = eleven.music(...)`, then `audio bed, volume: ...` |
 | `Unknown function 'eleven.generateMusic'` | Add `import eleven from "@elevenlabs/music"` before calling `eleven.generateMusic(...)` |
@@ -59,20 +61,14 @@ sips -g pixelWidth -g pixelHeight asset.png
 identify asset.webp
 ```
 
-Use real videos as
-
-```bash
-ffmpeg -y -loop 1 -t 4.5 -i screenshot.png -vf "scale=ceil(iw/2)*2:ceil(ih/2)*2,format=yuv420p" -r 30 screenshot-plate.mp4
-```
-
-Then declare the plate:
+Use real videos as motion footage and still images as static clips. The `video` primitive accepts both; the renderer detects image vs video:
 
 ```vidscript
-input screenshot = "assets/
+input screenshot = "assets/screenshots/homepage.png"
 [0s .. 4.5s] = video screenshot, width: 1080, height: 1920, x: 0, y: 0, animate: motion.fadeIn(0.35s)
 ```
 
-
+For URL-based ads, still assets can also drive generated motion: use `xai.imageToVideo(screenshot, "...")` for one strong source image, or `xai.referenceToVideo([product, logo, person], "...")` to guide a new scene with extracted references.
 
 Use the measured width and height to preserve aspect ratio. For a 1080x1920 vertical ad, a common centered fit is:
 
@@ -118,6 +114,7 @@ Only default package imports are supported for callable packages. Call functions
 
 ```vidscript
 import xai from "@scenerok/xai"
+import cf from "@scenerok/cloudflare"
 import eleven from "@elevenlabs/music"
 import motion from "@scenerok/basic-animations"
 ```
@@ -274,12 +271,22 @@ Audio sources can be MP3 inputs, audio plugin results, or the embedded audio str
 
 ```vidscript
 import xai from "@scenerok/xai"
+import cf from "@scenerok/cloudflare"
 import eleven from "@elevenlabs/music"
 import motion from "@scenerok/basic-animations"
 
-[-] = video xai.
+[- 4s] = video xai.image("Premium product still on seamless studio background, no text, no words, no letters, no captions, no logos, no watermark, no readable UI copy", aspect_ratio: "9:16", resolution: "2k")
+[-] = video xai.imagine("Cinematic product video, premium lighting, no text, no words, no letters, no captions, no logos, no watermark, no readable UI copy", aspect_ratio: "9:16", resolution: "720p", duration: 5)
+[-] = video xai.imageToVideo("https://cdn.example.com/product.png", "Slow premium camera move around the product, no text, no words, no letters, no captions, no logos, no watermark, no readable UI copy", aspect_ratio: "9:16", duration: 6)
+[-] = video xai.referenceToVideo(["https://cdn.example.com/product.png", "https://cdn.example.com/person.png"], "Lifestyle ad shot featuring the referenced product and person, no text, no words, no letters, no captions, no logos, no watermark, no readable UI copy", aspect_ratio: "9:16", duration: 8)
+[-] = video xai.extendVideo("https://cdn.example.com/clip.mp4", "Camera pulls back to reveal the full premium setup, no text, no words, no letters, no captions, no logos, no watermark, no readable UI copy", duration: 5)
 [-] = audio xai.tts("Welcome to SceneRok", voice: "eve")
 
+# Cloudflare AI Gateway package.
+[-] = video cf.video("High-energy fashion product shot, no text, no words, no letters, no captions, no logos, no watermark, no readable UI copy", model: "pixverse/v6", aspect_ratio: "9:16", duration: 5, generate_audio: false)
+[-] = video cf.imageToVideo("https://cdn.example.com/product.png", "Animate the product with realistic parallax, no text, no words, no letters, no captions, no logos, no watermark, no readable UI copy", model: "runwayml/gen-4.5", aspect_ratio: "9:16", duration: 5)
+[- 4s] = video cf.image("Clean editorial product still, no text, no words, no letters, no captions, no logos, no watermark, no readable UI copy", model: "black-forest-labs/flux-2-pro-preview", aspect_ratio: "9:16")
+
 # ElevenLabs music package functions require an import alias.
 [0s .. 15s] = audio eleven.music("Warm premium launch bed", duration: 15, instrumental: true)
 
@@ -288,13 +295,37 @@ let bed = eleven.music("Warm premium launch bed", duration: 15, instrumental: tr
 [0s .. 15s] = audio bed, volume: 0.35, fade_out: 2s
 ```
 
-`@scenerok/xai`
+`@scenerok/xai`, `@scenerok/cloudflare`, and `@elevenlabs/music` require explicit default imports. ElevenLabs music validates as `eleven.music(...)`, `eleven.generateMusic(...)`, or `eleven.composeMusic(...)` after `import eleven from "@elevenlabs/music"`. Plugin calls run at compile time; direct media plugin instructions cannot have trailing audio params, so use a `let` binding when you need `volume`, `fade_in`, or `fade_out`.
+
+### Media Plugin API Reference
+
+| Package | Function | Use when |
+|---------|----------|----------|
+| `@scenerok/xai` | `xai.image(prompt, ...)`, `xai.generateImage(prompt, ...)` | Generate a still image clip. Use `resolution: "1k"` or `"2k"` and `duration` for how long the still stays on the timeline. |
+| `@scenerok/xai` | `xai.imagine(prompt, ...)`, `xai.genVideo(prompt, ...)` | Generate video from text. Use video resolutions such as `"480p"` or `"720p"`. |
+| `@scenerok/xai` | `xai.imageToVideo(image, prompt, ...)` | Animate one source image, screenshot, logo, product photo, data URI, file id, or `{ url }` object. |
+| `@scenerok/xai` | `xai.referenceToVideo([images...], prompt, ...)` | Guide a generated scene with 1-7 reference images. Requires a prompt; keep duration <= 10s. |
+| `@scenerok/xai` | `xai.extendVideo(video, prompt, ...)` | Continue an existing video clip by up to 10s. |
+| `@scenerok/xai` | `xai.tts(text, ...)`, `xai.textToSpeech(text, ...)` | Generate spoken narration. Voices: `eve`, `ara`, `rex`, `sal`, `leo`. |
+| `@scenerok/cloudflare` | `cf.image(prompt, ...)`, `cf.generateImage(prompt, ...)` | Generate stills through Cloudflare AI Gateway image models such as `black-forest-labs/flux-2-pro-preview`, `openai/gpt-image-2`, `xai/grok-imagine-image-quality`, or `recraft/recraftv4-pro`. |
+| `@scenerok/cloudflare` | `cf.video(prompt, ...)`, `cf.imagine(prompt, ...)`, `cf.genVideo(prompt, ...)`, `cf.generateVideo(prompt, ...)` | Generate text-to-video through models such as `pixverse/v6`, `vidu/q3-turbo`, `runwayml/gen-4.5`, `minimax/hailuo-2.3`, `bytedance/seedance-2.0`, or `google/veo-3.1`. |
+| `@scenerok/cloudflare` | `cf.imageToVideo(image, prompt, ...)`, `cf.referenceToVideo([images...], prompt, ...)`, `cf.extendVideo(video, prompt, ...)` | Use Cloudflare-backed image-guided, reference-guided, or extension modes when the selected model supports them. |
+| `@scenerok/cloudflare` | `cf.listModels()`, `cf.models()` | Return model metadata as data; useful for exploration, not for timeline media. |
+| `@elevenlabs/music` | `eleven.music(prompt, ...)`, `eleven.generateMusic(prompt, ...)`, `eleven.composeMusic(prompt, ...)` | Generate background music. Use a `let` binding if you need `volume`, `fade_in`, or `fade_out`. |
+
+xAI TTS voices: `eve` for upbeat demos/announcements, `ara` for warm conversational narration, `rex` for business/tutorial content, `sal` for balanced general delivery, and `leo` for authoritative instruction.
+
+Do not call `eleven.tts(...)`; the registered ElevenLabs package is `@elevenlabs/music` for music beds only. Use `xai.tts(...)` or `xai.textToSpeech(...)` for voiceover.
+
+Cloudflare video supports common named params such as `model`, `aspect_ratio`, `resolution`, `quality`, `duration`, `negative_prompt`, `seed`, `generate_audio`, `image`, `reference_images`, `reference_video`, and `input` for model-specific overrides. Prefer explicit `model:` when choosing Cloudflare so the intended provider is clear.
 
 **Invalid (will not validate):**
 
 ```vidscript
 [-] = video xai.imagine("...") # missing xAI import
+[0s .. 5s] = video cf.video("...") # missing Cloudflare import
 [0s .. 15s] = audio eleven.generateMusic("...") # missing `import eleven from "@elevenlabs/music"`
+[0s .. 5s] = audio eleven.tts("...") # ElevenLabs TTS is not registered; use xai.tts
 [0s .. 15s] = audio eleven.music("..."), volume: 0.5 # trailing params after direct plugin call
 ```
 
@@ -340,7 +371,7 @@ Compiler lowers `animate:` into `IRMotionTrack[]` (property-path keyframes). Leg
 [0s .. 5s] = filter "glitch", intensity: 0.5, animate: motion.fadeIn(1s)
 ```
 
-Built-in filters: `monochrome`, `sepia`, `blur`, `chromatic`, `glitch`, `vignette`, `contrast`, `saturation`, `brightness`.
+Built-in filters: `monochrome`, `sepia`, `blur`, `chromatic`, `glitch`, `vignette`, `contrast`, `saturation`, `brightness`. Filters are optional finishing tools; use them only when they improve the composition.
 
 ### Animation functions (`@scenerok/basic-animations`)
 
@@ -22,8 +22,6 @@ input cta_text = "{{cta_text | visit your nearest grocery store}}"
 
 [8s .. 11s] = text cta_text, font: "Bebas Neue", size: 48, color: "#FFD700", x: "50%", y: "80%", align: center, stroke: "#3E1C00", stroke_width: 4, animate: motion.fadeIn(0.8s)
 
-[0s .. 12s] = filter "vignette", intensity: 0.3
-
 output to "product-promo.mp4", resolution: "1080x1920", fps: 30
 ```
 
@@ -6,11 +6,12 @@ Follow these rules exactly. When in doubt, copy a file from `examples/system/` (
 
 ```vidscript
 import xai from "@scenerok/xai"
+import cf from "@scenerok/cloudflare"
 import eleven from "@elevenlabs/music"
 import motion from "@scenerok/basic-animations"
 ```
 
-Without the owning import, calls like `xai.imagine`, `xai.tts`, `eleven.music`, and `motion.fadeIn` fail validation.
+Without the owning import, calls like `xai.imagine`, `cf.video`, `xai.tts`, `eleven.music`, and `motion.fadeIn` fail validation.
 
 **Never** call package functions without importing the package alias first.
 
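Editorial sketch (not part of the package diff): the import rule above can be approximated with a quick local pre-check before running `scenerok validate`. The Python below is a rough heuristic, not the real validator. The function name `missing_imports` and the regular expressions are assumptions, and it only understands the `import alias from "package"` and `alias.func(...)` shapes shown in this diff.

```python
import re

# Shapes taken from the examples in this diff; everything else is ignored.
IMPORT_RE = re.compile(r'^\s*import\s+(\w+)\s+from\s+"([^"]+)"', re.MULTILINE)
STRING_RE = re.compile(r'"[^"]*"')
CALL_RE = re.compile(r"\b(\w+)\.(\w+)\s*\(")


def missing_imports(script: str) -> list[str]:
    """List alias.func calls whose alias was never imported (heuristic only)."""
    imported = {alias for alias, _package in IMPORT_RE.findall(script)}
    missing = set()
    for line in script.splitlines():
        code = STRING_RE.sub('""', line)  # ignore URLs and prompts inside strings
        code = code.split("#", 1)[0]      # ignore trailing comments
        for alias, func in CALL_RE.findall(code):
            if alias not in imported:
                missing.add(f"{alias}.{func}")
    return sorted(missing)


sample = '''import xai from "@scenerok/xai"
[-] = video xai.imagine("Cinematic product video", duration: 5)
[0s .. 5s] = video cf.video("Fashion shot", model: "pixverse/v6")
'''
print(missing_imports(sample))  # ['cf.video']
```

`scenerok validate` remains the authoritative check; a heuristic like this would only catch the most common slip earlier.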
@@ -20,7 +21,7 @@ When the user gives a product, company, ecommerce, landing page, or app URL and
 
 Do not use full-page or very long website screenshots as video backgrounds; their text is usually illegible at output size. Crop or frame screenshots to the specific visual proof point, and use them sparingly.
 
-Website screenshots are optional source material, not a requirement to build the whole video from screenshots. You may combine them with
+Website screenshots are optional source material, not a requirement to build the whole video from screenshots. You may combine them with still generation, text-to-video, image-to-video, reference-to-video, or extension plugins for lifestyle scenes, motion backgrounds, transitions, or product context. Do not invent claims; use website copy or ask the user.
 
 ## Rule 2.5: Probe real asset type and dimensions before placement
 
@@ -33,18 +34,10 @@ sips -g pixelWidth -g pixelHeight asset.png
 identify asset.webp
 ```
 
-Use real video files for
-
-For screenshots, logos, and other stills, convert the still to an H.264 MP4 plate first:
-
-```bash
-ffmpeg -y -loop 1 -t 4 -i screenshot.png -vf "scale=ceil(iw/2)*2:ceil(ih/2)*2,format=yuv420p" -r 30 screenshot-plate.mp4
-```
-
-Then use the plate:
+Use real video files for motion footage and still image inputs for screenshots, logos, product photos, and brand marks. The `video` primitive accepts both videos and images; image inputs render as static clips for the block duration.
 
 ```vidscript
-input screenshot = "assets/
+input screenshot = "assets/screenshots/homepage.png"
 [0s .. 4s] = video screenshot, width: 1080, height: 1920, x: 0, y: 0, animate: motion.fadeIn(0.35s)
 ```
 
@@ -81,14 +74,38 @@ Invalid: asking the generator for signs, labels, packaging text, app UI text, ti
|
|
|
81
74
|
|
|
82
75
|
```vidscript
|
|
83
76
|
import xai from "@scenerok/xai"
|
|
77
|
+
import cf from "@scenerok/cloudflare"
|
|
84
78
|
import eleven from "@elevenlabs/music"
|
|
85
79
|
|
|
80
|
+
[- 4s] = video xai.image("Prompt here, no text, no words, no letters, no captions, no logos, no watermark, no readable UI copy", aspect_ratio: "9:16", resolution: "2k")
|
|
86
81
|
[-] = video xai.imagine("Prompt here, no text, no words, no letters, no captions, no logos, no watermark, no readable UI copy", aspect_ratio: "9:16", duration: 5)
|
|
82
|
+
[-] = video xai.imageToVideo("https://cdn.example.com/product.png", "Slow camera move around the product, no text, no words, no letters, no captions, no logos, no watermark, no readable UI copy", aspect_ratio: "9:16", duration: 6)
|
|
83
|
+
[-] = video xai.referenceToVideo(["https://cdn.example.com/product.png", "https://cdn.example.com/person.png"], "Lifestyle scene using the referenced product and person, no text, no words, no letters, no captions, no logos, no watermark, no readable UI copy", aspect_ratio: "9:16", duration: 8)
|
|
84
|
+
[-] = video xai.extendVideo("https://cdn.example.com/clip.mp4", "Continue the camera move, no text, no words, no letters, no captions, no logos, no watermark, no readable UI copy", duration: 5)
|
|
87
85
|
[-] = audio xai.tts("Voiceover line", voice: "eve")
|
|
86
|
+
|
|
87
|
+
[-] = video cf.video("Prompt here, no text, no words, no letters, no captions, no logos, no watermark, no readable UI copy", model: "pixverse/v6", aspect_ratio: "9:16", duration: 5, generate_audio: false)
|
|
88
|
+
[-] = video cf.imageToVideo("https://cdn.example.com/product.png", "Animate the image, no text, no words, no letters, no captions, no logos, no watermark, no readable UI copy", model: "runwayml/gen-4.5", aspect_ratio: "9:16", duration: 5)
|
|
88
89
|
[0s .. 15s] = audio eleven.music("Warm premium music bed", duration: 15, instrumental: true)
|
|
89
90
|
```
|
|
90
91
|
|
|
91
|
-
|
|
92
|
+
Do not overuse one plugin function. Choose deliberately:
|
|
93
|
+
|
|
94
|
+
| Need | Use |
|
|
95
|
+
|------|-----|
|
|
96
|
+
| Still image clip | `xai.image`, `xai.generateImage`, `cf.image`, `cf.generateImage` |
|
|
97
|
+
| Text-to-video | `xai.imagine`, `xai.genVideo`, `cf.video`, `cf.imagine`, `cf.generateVideo` |
|
|
98
|
+
| Animate one image/screenshot/product photo | `xai.imageToVideo`, `cf.imageToVideo` |
|
|
99
|
+
| Guide with 1-7 reference images | `xai.referenceToVideo`, `cf.referenceToVideo` |
|
|
100
|
+
| Continue a clip | `xai.extendVideo`, `cf.extendVideo` |
|
|
101
|
+
| Voiceover | `xai.tts`, `xai.textToSpeech` |
|
|
102
|
+
| Music bed | `eleven.music`, `eleven.generateMusic`, `eleven.composeMusic` |
|
|
103
|
+
|
|
104
|
+
Use `imageToVideo` for one extracted screenshot/product/logo image. Use `referenceToVideo` for 1-7 reference images when extracted objects should guide the generated shot. xAI `referenceToVideo` requires a prompt, and its duration must be <= 10s. Prefer an explicit `model:` for Cloudflare calls, e.g. `pixverse/v6`, `vidu/q3-turbo`, `runwayml/gen-4.5`, `minimax/hailuo-2.3`, `bytedance/seedance-2.0`, `google/veo-3.1`, `black-forest-labs/flux-2-pro-preview`, or `openai/gpt-image-2`.
|
|
105
|
+
|
|
106
|
+
xAI TTS voices are only `eve`, `ara`, `rex`, `sal`, and `leo`.
|
|
107
|
+
|
|
108
|
+
For ElevenLabs music, import `@elevenlabs/music` and use `eleven.music(...)`, `eleven.generateMusic(...)`, or `eleven.composeMusic(...)`. Do not call `eleven.tts(...)`; ElevenLabs TTS is not registered in VidScript today. Ads, promos, tutorials, and reels should normally include either `xai.tts` voiceover, an ElevenLabs music bed, or both unless the user asks for silent output.
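The limits stated above (the five xAI TTS voices, 1-7 reference images, and the xAI prompt and 10s duration requirements) can be checked before a script is rendered. The following is a minimal sketch in Python, not part of the SceneRok CLI; `check_xai_tts` and `check_reference_to_video` are hypothetical helper names, and the prompt and duration checks are applied to xAI only because that is the only provider for which the text states them.

```python
XAI_TTS_VOICES = {"eve", "ara", "rex", "sal", "leo"}


def check_xai_tts(voice: str) -> list[str]:
    """Return a list of problems for an xai.tts voice argument."""
    if voice not in XAI_TTS_VOICES:
        allowed = ", ".join(sorted(XAI_TTS_VOICES))
        return [f"unknown xAI TTS voice {voice!r}; expected one of: {allowed}"]
    return []


def check_reference_to_video(
    provider: str, references: list[str], prompt: str, duration_s: float
) -> list[str]:
    """Return a list of problems for a referenceToVideo call."""
    problems = []
    # Both providers are documented as taking 1-7 reference images.
    if not 1 <= len(references) <= 7:
        problems.append("referenceToVideo takes 1-7 reference images")
    # The prompt and duration limits are stated for xAI only.
    if provider == "xai":
        if not prompt.strip():
            problems.append("xAI referenceToVideo requires a prompt")
        if duration_s > 10:
            problems.append("xAI referenceToVideo duration must be <= 10s")
    return problems
```

An empty list means the call passes these checks; it says nothing about other limits a provider may enforce.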
|
|
92
109
|
|
|
93
110
|
## Rule 5: Time ranges use `..`
|
|
94
111
|
|
|
@@ -112,6 +129,7 @@ Use only straight `"` inside `xai.imagine("...")`. Do not nest quotes inside the
|
|
|
112
129
|
|
|
113
130
|
```vidscript
|
|
114
131
|
import xai from "@scenerok/xai"
|
|
132
|
+
import cf from "@scenerok/cloudflare"
|
|
115
133
|
import eleven from "@elevenlabs/music"
|
|
116
134
|
|
|
117
135
|
[-] = audio xai.tts("Spoken line", voice: "eve", speed: 1.0)
|
|
@@ -128,6 +146,10 @@ let bed = eleven.music("Soft premium ad music", duration: 15, instrumental: true
|
|
|
128
146
|
|
|
129
147
|
**Invalid:** `eleven.generateMusic(...)` without `import eleven from "@elevenlabs/music"`.
|
|
130
148
|
|
|
149
|
+
**Invalid:** `eleven.tts(...)` — not a registered function; use `xai.tts(...)`.
|
|
150
|
+
|
|
151
|
+
**Invalid:** `cf.video(...)` without `import cf from "@scenerok/cloudflare"`.
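The "call without import" mistakes listed above can be caught with a simple text scan. The following is a rough sketch in Python, not part of the SceneRok CLI; `missing_imports` is a hypothetical name, and the regular expressions only approximate VidScript syntax as it appears in these examples rather than parsing it.

```python
import re

IMPORT_RE = re.compile(r'^\s*import\s+([A-Za-z_]\w*)\s+from\s+"[^"]+"', re.MULTILINE)
CALL_RE = re.compile(r"\b([A-Za-z_]\w*)\.[A-Za-z_]\w*\(")
STRING_RE = re.compile(r'"[^"]*"')


def missing_imports(source: str) -> set[str]:
    """Return plugin aliases that are called but never imported."""
    imported = set(IMPORT_RE.findall(source))
    # Blank out string literals so prompt text and URLs are not scanned.
    code_only = STRING_RE.sub('""', source)
    used = set(CALL_RE.findall(code_only))
    return used - imported
```

For a script that imports only `xai` but calls `cf.video(...)`, the function returns `{"cf"}`. It does not flag unregistered functions such as `eleven.tts(...)`, which would need a separate list of known function names.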
|
|
152
|
+
|
|
131
153
|
## Rule 9: Filters
|
|
132
154
|
|
|
133
155
|
```vidscript
|
|
@@ -135,7 +157,7 @@ let bed = eleven.music("Soft premium ad music", duration: 15, instrumental: true
|
|
|
135
157
|
[0s .. 15s] = filter "saturation", value: 1.12
|
|
136
158
|
```
|
|
137
159
|
|
|
138
|
-
`saturation` uses `value`, not `intensity`.
|
|
160
|
+
Filters are optional finishing tools, not required ingredients. Use them only when they improve the visual result. `saturation` uses `value`, not `intensity`.
|
|
139
161
|
|
|
140
162
|
## Rule 10: Always end with output
|
|
141
163
|
|