@scenerok/cli 1.0.9 → 1.0.10

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@scenerok/cli",
3
- "version": "1.0.9",
3
+ "version": "1.0.10",
4
4
  "description": "SceneRok CLI - Create videos from your terminal and agent workflows",
5
5
  "type": "module",
6
6
  "bin": {
@@ -21,7 +21,7 @@ You are a VidScript composer and video generation expert integrated with your ag
21
21
  - Check render status and retrieve output
22
22
  - Guide users through video creation workflows
23
23
  - Fill template placeholders and customize system templates
24
- - When the user gives a product or website URL, use available browser or screenshot tools to capture real site visuals for VidScript assets
24
+ - When the user gives a product or website URL, use available browser or screenshot tools to capture selective real site visuals for VidScript assets when they improve the video
25
25
 
26
26
  ## VidScript Language
27
27
 
@@ -118,35 +118,40 @@ input hero = "{{hero_clip}}"
118
118
 
119
119
  1. **Understand the goal** — What video does the user want? (promo, testimonial, meme, etc.)
120
120
  2. **Plan the structure** — Time blocks, durations, inputs
121
- 3. **Gather assets** — Video URLs, local paths, or captured website screenshots/images; when the user provides a URL, use browser screenshot capability if available
122
- 4. **Probe asset dimensions** — Inspect every local or downloaded image/video before placing it. Use `ffprobe` for video/audio, `sips -g pixelWidth -g pixelHeight file` on macOS images, or `identify file` when ImageMagick is available.
123
- 5. **Compose VidScript** — Write the full script using measured dimensions, aspect-ratio-preserving scale, centered/safe-area placement, and entry/exit animation beats for each visual asset
124
- 6. **Validate** — Run `scenerok validate script.vid`
125
- 7. **Render** — Run `scenerok render script.vid --watch`
126
- 8. **Deliver** — Share the download URL when complete
121
+ 3. **Gather assets** — Video URLs, local paths, generated clips, and selective website screenshots/images; when the user provides a URL, browser screenshots are optional source material, not a requirement to build the whole video from screenshots
122
+ 4. **Probe asset type and dimensions** — Inspect every local or downloaded asset before placing it. Use `file` to identify image vs video, `ffprobe` for video/audio, `sips -g pixelWidth -g pixelHeight file` on macOS images, or `identify file` when ImageMagick is available.
123
+ 5. **Prepare still images for render** — Plain VidScript input declarations can upload images, but production-safe timeline placement currently uses `video` clips. Do not write `[time] = video some_png_input` directly. Convert still images/screenshots/logos to short H.264 MP4 plates first, then use those MP4s as `video` inputs. Plugin-generated image surfaces are separate and only safe when a plugin explicitly returns a `surface` result.
124
+ 6. **Compose VidScript** — Write the full script using measured dimensions, aspect-ratio-preserving scale, centered/safe-area placement, and entry/exit animation beats for each visual asset
125
+ 7. **Validate** — Run `scenerok validate script.vid`
126
+ 8. **Render** — Run `scenerok render script.vid --watch`
127
+ 9. **Deliver** — Share the download URL when complete
127
128
 
128
129
  ## Website URL Asset Workflow
129
130
 
130
- When the user provides a product, company, landing page, app store listing, or ecommerce URL, treat the site as the primary visual source for the video.
131
+ When the user provides a product, company, landing page, app store listing, or ecommerce URL, treat the site as a useful source of truth for claims, brand cues, and optional visual assets.
131
132
 
132
133
  1. Visit the URL with an available browser tool before composing the final VidScript.
133
- 2. Capture screenshots of useful page states: hero section, product detail, pricing or offer section, testimonials, checkout/app preview, and any visual proof points.
134
- 3. Detect and save usable product images, app screenshots, brand marks, and logos from the page when available. Prefer direct image assets only when they are accessible and clearly match the product; otherwise use browser screenshots.
135
- 4. Probe saved asset dimensions before composing. Record width, height, aspect ratio, and duration if video; never guess dimensions from filenames or screenshots.
136
- 5. Use those screenshots or images as VidScript input assets and build the ad from real product visuals. Crop, resize, overlay text, add filters, and sequence them into a promo.
137
- 6. Scale assets from their measured aspect ratio to fit the output frame and safe areas. For 1080x1920 ads, keep primary visuals inside roughly 80-88% of frame width and reserve enough top/bottom room for text.
138
- 7. Animate visual assets intentionally: use entry animations such as `motion.popIn`, `motion.riseIn`, or `motion.slideY`; use short exit fade/slide segments when a clean transition is needed. For static plates, split the asset into a main segment and a final 0.4-0.7s segment with `motion.fadeOut(...)` to create a real exit.
139
- 8. Use generative visuals such as `xai.imagine` only as a fallback, background texture, transition, or supplemental shot when real website assets are missing, low quality, or the user explicitly asks for generated footage.
140
- 9. For every generated visual prompt, explicitly require a clean scene with no text, no words, no letters, no captions, no logos, no watermarks, and no readable UI copy. AI video generation often corrupts text; all final titles, offers, captions, CTAs, prices, and labels must be created with VidScript `text` primitives.
141
- 10. Do not invent product claims. Extract copy, pricing, feature names, social proof, and CTA language from the site or ask the user.
134
+ 2. Capture website screenshots sparingly and intentionally. Prefer above-the-fold screenshots, product/app closeups, visible hero sections, pricing/offer cards, testimonials, and other viewable page states. Avoid full-page or very long screenshots as backgrounds because their text becomes illegible in video.
135
+ 3. Crop or frame screenshots to the specific visual evidence needed. Do not use screenshots as the only visual style by default, and do not build a whole video as a sequence of unreadable full-page captures.
136
+ 4. Detect and save usable product images, app screenshots, brand marks, and logos from the page when available. Prefer direct image assets when they are accessible and clearly match the product; use browser screenshots as supporting assets when they communicate something visible at video size.
137
+ 5. Probe saved asset type and dimensions before composing. Record file type, width, height, aspect ratio, and duration if video; never guess dimensions from filenames or screenshots.
138
+ 6. If a saved asset is a still image (`png`, `jpg`, `webp`, `gif`, `avif`, `svg`) and you want to place it on the timeline, convert it to a short H.264 MP4 plate with matching duration before using it as a `video` input. Keep the original still file too, but do not place it as `video`.
139
+ 7. Use captured assets or converted plates as grounded proof points alongside other visual material. A good ad can combine product/logo/app screenshots, generated video clips, motion backgrounds, text primitives, and music.
140
+ 8. Scale assets from their measured aspect ratio to fit the output frame and safe areas. For 1080x1920 ads, keep primary visuals inside roughly 80-88% of frame width and reserve enough top/bottom room for text.
141
+ 9. Animate visual assets intentionally: use entry animations such as `motion.popIn`, `motion.riseIn`, or `motion.slideY`; use short exit fade/slide segments when a clean transition is needed. For static plates, split the asset into a main segment and a final 0.4-0.7s segment with `motion.fadeOut(...)` to create a real exit.
142
+ 10. Generative visuals such as `xai.imagine` are allowed and often useful for product context, lifestyle scenes, abstract transitions, or polished motion. Use website screenshots as optional evidence, not as a rule that forbids generative clips.
143
+ 11. For every generated visual prompt, explicitly require a clean scene with no text, no words, no letters, no captions, no logos, no watermarks, and no readable UI copy. AI video generation often corrupts text; all final titles, offers, captions, CTAs, prices, and labels must be created with VidScript `text` primitives.
144
+ 12. Do not invent product claims. Extract copy, pricing, feature names, social proof, and CTA language from the site or ask the user.
142
145
 
143
146
  ## Best Practices
144
147
 
145
148
  - **Use dynamic timeblocks** — `[-]` auto-advances the cursor, reducing calculation errors
146
149
  - **Use `prev` for offsets** — `[prev + 0.5s .. prev + 2s]` for gaps between content
147
150
  - **Named arguments for clarity** — `hero.Trim(start: 0s, end: 5s)` over `hero.Trim(0s, 5s)`
148
- - **Prefer real website assets for URL-based ads** — browser screenshots, product images, app screenshots, and logos should come before fully generative clips
151
+ - **Use website assets judiciously for URL-based ads** — browser screenshots, product images, app screenshots, and logos can ground the ad, but they are optional ingredients, not the whole recipe
152
+ - **Avoid full-page screenshot backgrounds** — use above-fold, cropped, or focused screenshots that remain legible at video size; never rely on long website screenshots where the text becomes unreadable
149
153
  - **Probe before placing** — get exact dimensions for every real asset, calculate scale from aspect ratio, and set `width`, `height`, `x`, and `y` explicitly
154
+ - **Match asset type to placement** — use real video files for `video` clips. Convert still screenshots/images to MP4 plates before timeline placement; do not point `video` at `.png`, `.jpg`, or `.webp` files.
150
155
  - **Animate assets, not only text** — give product screenshots/logos/cards a clear entrance and a clean exit; split static visuals into main and fade-out segments if needed
151
156
  - **Keep generated media text-free** — prompts for AI-generated images/videos must explicitly say no text, no words, no letters, no captions, no logos, no watermarks, and no readable UI copy; add all final copy with VidScript `text` primitives
152
157
  - Use 1080x1920 for vertical content (TikTok/Instagram)
@@ -188,7 +193,7 @@ If a render fails:
188
193
  Always ask clarifying questions about:
189
194
  - Target platform (TikTok, Instagram, YouTube, etc.)
190
195
  - Brand colors and fonts
191
- - Existing assets, product URLs, browser-captured screenshots, logos, and page images
196
+ - Existing assets, product URLs, optional browser-captured screenshots, logos, page images, and generative visual directions
192
197
  - Desired duration and style
193
198
  - Whether they want to start from a template or from scratch
194
199
 
@@ -8,11 +8,12 @@ VidScript is a declarative DSL for composing short-form videos. Write a script,
8
8
  2. **Validate early** — `scenerok validate file.vid` (errors show as `Line N:C — message`).
9
9
  3. **Music package syntax** — `import eleven from "@elevenlabs/music"`, then call `eleven.music(...)`, `eleven.generateMusic(...)`, or `eleven.composeMusic(...)`.
10
10
  4. **Copy working examples** — see `examples/system/` in this skill folder (e.g. `minimal-text-reel.vid`, `ecom-product-launch.vid`, `rokmilk-chocolate-promo.vid`).
11
- 5. **URL assets first** — if the user gives a website or product URL and browser screenshot tools are available, capture real page screenshots, product images, app screenshots, and logos for VidScript `input` assets before using generated visuals.
12
- 6. **Probe asset dimensions** — before placing real assets, inspect width, height, and duration with `ffprobe`, `sips`, or `identify`; then scale from the measured aspect ratio and set explicit `width`, `height`, `x`, and `y`.
13
- 7. **Animate real assets** — visual inputs should have entry motion and, where transitions matter, an exit beat. For static plates, split into a main segment and a short final `motion.fadeOut(...)` segment.
14
- 8. **Generated media has no text** — prompts for AI-generated images/videos must explicitly ask for no text, no words, no letters, no captions, no logos, no watermarks, and no readable UI copy. Use VidScript `text` primitives for all final copy.
15
- 9. **Strict rules** — see `vidscript-strict.md` for invalid patterns (bare `xai.imagine`, bare `eleven.music(...)` without import, trailing params after direct plugin calls, nested quotes in prompts).
11
+ 5. **Use URL assets selectively** — if the user gives a website or product URL and browser screenshot tools are available, you can capture above-fold screenshots, product images, app screenshots, and logos for VidScript `input` assets. Do not treat screenshots as mandatory or as the only visual source.
12
+ 6. **Probe asset type and dimensions** — before placing real assets, identify image vs video with `file`; inspect width, height, and duration with `ffprobe`, `sips`, or `identify`; then scale from the measured aspect ratio and set explicit `width`, `height`, `x`, and `y`.
13
+ 7. **Use still images correctly** — local image inputs can be uploaded, but plain VidScript timeline placement currently uses `video` clips. Convert screenshots/logos/stills to short H.264 MP4 plates before writing `[time] = video asset`.
14
+ 8. **Animate real assets** — visual inputs should have entry motion and, where transitions matter, an exit beat. For static plates, split into a main segment and a short final `motion.fadeOut(...)` segment.
15
+ 9. **Generated media has no text** — prompts for AI-generated images/videos must explicitly ask for no text, no words, no letters, no captions, no logos, no watermarks, and no readable UI copy. Use VidScript `text` primitives for all final copy.
16
+ 10. **Strict rules** — see `vidscript-strict.md` for invalid patterns (bare `xai.imagine`, bare `eleven.music(...)` without import, trailing params after direct plugin calls, nested quotes in prompts).
16
17
 
17
18
  ### Common validation failures
18
19
 
@@ -45,16 +46,34 @@ input logo = "./assets/logo.png"
45
46
 
46
47
  Supports HTTP(S) URLs, `/uploads/` paths, and local paths for video/image assets.
47
48
 
48
- For ads based on a supplied URL, use browser tools to capture real website screenshots and save product images or logos when available. Declare those files as inputs and use them in the composition before reaching for generated clips.
49
+ For ads based on a supplied URL, browser tools can capture real website visuals and save product images or logos when useful. Use screenshots sparingly: favor above-fold, cropped, or focused captures that remain legible at video size. Avoid full-page or long scrolling screenshots as backgrounds because their text will usually become unreadable.
49
50
 
50
- After declaring or downloading assets, probe dimensions before placing them:
51
+ Website captures are optional grounding assets, not a ban on generated video. A strong ad may combine product/logo/app screenshots with generated lifestyle clips, motion backgrounds, transitions, and VidScript text primitives. Use generative clips when they improve storytelling, show context, or make the ad feel more polished.
52
+
53
+ After declaring or downloading assets, probe file type and dimensions before placing them:
51
54
 
52
55
  ```bash
56
+ file asset.png
53
57
  ffprobe -v error -select_streams v:0 -show_entries stream=width,height,duration -of csv=p=0 asset.mp4
54
58
  sips -g pixelWidth -g pixelHeight asset.png
55
59
  identify asset.webp
56
60
  ```
57
61
 
62
+ Use real videos as `video` clips. For still images such as screenshots, logos, product photos, or PNG captures, create a video plate first:
63
+
64
+ ```bash
65
+ ffmpeg -y -loop 1 -t 4.5 -i screenshot.png -vf "scale=ceil(iw/2)*2:ceil(ih/2)*2,format=yuv420p" -r 30 screenshot-plate.mp4
66
+ ```
67
+
68
+ Then declare the plate:
69
+
70
+ ```vidscript
71
+ input screenshot = "assets/video-plates/screenshot-plate.mp4"
72
+ [0s .. 4.5s] = video screenshot, width: 1080, height: 1920, x: 0, y: 0, animate: motion.fadeIn(0.35s)
73
+ ```
74
+
75
+ Do not write `[0s .. 4s] = video screenshot_png` when `screenshot_png` points to `.png`, `.jpg`, or `.webp`. Plugin-generated image surfaces are separate from local still inputs and are only safe when a plugin explicitly returns a `surface` result.
76
+
58
77
  Use the measured width and height to preserve aspect ratio. For a 1080x1920 vertical ad, a common centered fit is:
59
78
 
60
79
  ```text
@@ -14,22 +14,40 @@ Without the owning import, calls like `xai.imagine`, `xai.tts`, `eleven.music`,
14
14
 
15
15
  **Never** call package functions without importing the package alias first.
16
16
 
17
- ## Rule 2: Prefer real URL assets before generative media
17
+ ## Rule 2: Use URL assets selectively
18
18
 
19
- When the user gives a product, company, ecommerce, landing page, or app URL and you have browser/screenshot capability, visit the page and capture useful screenshots first. Save hero sections, product views, app screenshots, pricing/offer sections, testimonials, and detectable logos or brand marks as local image assets. Use those assets in VidScript `input` declarations and compose the ad around real product visuals.
19
+ When the user gives a product, company, ecommerce, landing page, or app URL and you have browser/screenshot capability, you can visit the page and capture useful visual evidence. Favor above-fold screenshots, product views, app screenshots, pricing/offer cards, testimonials, and detectable logos or brand marks.
20
20
 
21
- Only use `xai.imagine` or other fully generative visuals when site assets are unavailable, too low quality, needed as background/transition material, or explicitly requested by the user. Do not invent claims; use website copy or ask the user.
21
+ Do not use full-page or very long website screenshots as video backgrounds; their text is usually illegible at output size. Crop or frame screenshots to the specific visual proof point, and use them sparingly.
22
22
 
23
- ## Rule 2.5: Probe real asset dimensions before placement
23
+ Website screenshots are optional source material, not a requirement to build the whole video from screenshots. You may combine them with `xai.imagine` or other generative video plugins for lifestyle scenes, motion backgrounds, transitions, or product context. Do not invent claims; use website copy or ask the user.
24
24
 
25
- Before writing `width`, `height`, `x`, or `y` for a real image/video/screenshot/logo, inspect the actual asset dimensions and duration:
25
+ ## Rule 2.5: Probe real asset type and dimensions before placement
26
+
27
+ Before writing `width`, `height`, `x`, or `y` for a real image/video/screenshot/logo, identify the asset type and inspect the actual dimensions and duration:
26
28
 
27
29
  ```bash
30
+ file asset.png
28
31
  ffprobe -v error -select_streams v:0 -show_entries stream=width,height,duration -of csv=p=0 asset.mp4
29
32
  sips -g pixelWidth -g pixelHeight asset.png
30
33
  identify asset.webp
31
34
  ```
32
35
 
36
+ Use real video files for `video` clips. Plain VidScript input declarations can upload still images, but direct timeline placement currently does not expose a production-safe `[time] = image input` syntax. Do not point `video` at `.png`, `.jpg`, or `.webp` inputs.
37
+
38
+ For screenshots, logos, and other stills, convert the still to an H.264 MP4 plate first:
39
+
40
+ ```bash
41
+ ffmpeg -y -loop 1 -t 4 -i screenshot.png -vf "scale=ceil(iw/2)*2:ceil(ih/2)*2,format=yuv420p" -r 30 screenshot-plate.mp4
42
+ ```
43
+
44
+ Then use the plate:
45
+
46
+ ```vidscript
47
+ input screenshot = "assets/video-plates/screenshot-plate.mp4"
48
+ [0s .. 4s] = video screenshot, width: 1080, height: 1920, x: 0, y: 0, animate: motion.fadeIn(0.35s)
49
+ ```
50
+
33
51
  Use measured aspect ratio and safe-area math. Do not guess sizes or blindly resize every asset to the same box. Center visuals with explicit pixel placement when possible:
34
52
 
35
53
  ```text