cueframe 0.1.0

@@ -0,0 +1,275 @@
1
+ ---
2
+ name: cueframe-product-video
3
+ description: Use when turning a screen recording (with cursor/click events) into a polished, auto-zooming product video or demo reel via CueFrame — e.g. "make a product demo from this Browserbase capture", "auto-zoom this screen recording to where the user clicked", "turn this walkthrough into a social demo". This is a USE-CASE recipe over the CueFrame primitives in the `cueframe-cli` skill.
4
+ ---
5
+
6
+ # CueFrame — Product Video recipe
7
+
8
+ A **use-case recipe**, not new framework surface. CueFrame's engine only knows
9
+ primitives (`upload` → `composition put` → `render`, the `reframe` viewport, the
10
+ overlay/effect primitives). This skill is the *judgment* on top: how to turn a
11
+ captured walkthrough into a product video that looks good. Drive the primitives
12
+ with the `cueframe-cli` skill; this tells you *what* to author.
13
+
14
+ ## The pipeline
15
+
16
+ ```
17
+ capture (video + cursor/click points) → auto-zoom (points → reframe)
18
+ → compose (video track + per-clip reframe) → polish (overlay primitives)
19
+ → render
20
+ ```
21
+
22
+ ## 1. Capture → timed focus points
23
+
24
+ Get a screen recording **plus** the cursor/click events with viewport
25
+ coordinates and timestamps (e.g. from a browser-automation capture). Normalize
26
+ every point to `{ t: seconds, x: 0..1, y: 0..1 }` against the **real captured
27
+ viewport** — these are the *focus points* the zoom follows. Measure the rendered
28
+ video's true duration (ffprobe) and map point times onto it.
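As a rough illustration of the normalization step, here is a minimal sketch. The raw event shape (`tMs`, `xPx`, `yPx`), the helper name `toFocusPoints`, and the linear time rescale are all assumptions made for this example, not part of CueFrame.

```typescript
// Hypothetical raw capture event: viewport pixels and milliseconds.
interface RawEvent { tMs: number; xPx: number; yPx: number }
// The normalized focus-point shape described above.
interface FocusPoint { t: number; x: number; y: number }

const clamp01 = (v: number): number => Math.min(1, Math.max(0, v));

function toFocusPoints(
  events: RawEvent[],
  viewport: { width: number; height: number }, // the real captured viewport
  captureDurationSec: number, // duration according to the capture log
  videoDurationSec: number,   // true duration measured with ffprobe
): FocusPoint[] {
  // ASSUMPTION: capture time maps linearly onto the rendered video's timeline.
  const scale = captureDurationSec > 0 ? videoDurationSec / captureDurationSec : 1;
  return events
    .map((e) => ({
      t: Math.min(videoDurationSec, Math.max(0, (e.tMs / 1000) * scale)),
      x: clamp01(e.xPx / viewport.width),
      y: clamp01(e.yPx / viewport.height),
    }))
    .sort((a, b) => a.t - b.t);
}

const focusPoints = toFocusPoints(
  [{ tMs: 8000, xPx: 960, yPx: 540 }, { tMs: 2500, xPx: 320, yPx: 360 }],
  { width: 1280, height: 720 },
  12.5,
  12,
);
console.log(focusPoints);
```

Clamping keeps stray off-viewport events inside `0..1`; whether to clamp or drop such events is a judgment call this sketch does not settle.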
29
+
30
+ ## 2. Points → reframe (auto-zoom)
31
+
32
+ The framework owns the tuning. Canonical implementation:
33
+ `autoZoom(points, style)` in `@cueframe/composition` (timed focus points →
34
+ gapless `reframe` segments; never re-derive this with an LLM — it drifts). Pick a
35
+ **style**, never raw numbers:
36
+
37
+ | style | zoom (`<1` punches in) | feel |
38
+ |---|---|---|
39
+ | `subtle` | 0.80 | gentle, corporate |
40
+ | `standard` | 0.67 | default |
41
+ | `punchy` | 0.55 | snappy, social |
42
+
43
+ The tiling rule (what `autoZoom` does — use this until it's exposed as
44
+ `cueframe compose from-points`): sort points by `t`; group points within
45
+ `mergeGap` seconds of each other; pad each group `prePad` before / `postPad` after; drop groups
46
+ shorter than `minZoom`; merge overlapping windows; emit `frame-center` zoom 1
47
+ in the gaps and a `point` punch-in (`focus:{mode:"point",x,y}`, `zoom` = the
48
+ style's) over each window. Result → `source.reframe.segments` on the clip.
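The tiling rule above can be sketched as follows. This is an illustrative reading of the rule, not the `autoZoom` source: the tuning values in `TUNING`, the choice to focus on the mean of a group's points, and the name `tilePoints` are assumptions. The zoom values come from the style table. `ease` is omitted for brevity.

```typescript
type ZoomStyle = "subtle" | "standard" | "punchy";
interface FocusPoint { t: number; x: number; y: number }
type FocusTarget = { mode: "frame-center" } | { mode: "point"; x: number; y: number };
interface ReframeSegment { startSec: number; endSec: number; focus: FocusTarget; zoom: number }
interface ZoomWindow { start: number; end: number; x: number; y: number }

// Zoom per style, taken from the table above.
const STYLE_ZOOM: Record<ZoomStyle, number> = { subtle: 0.8, standard: 0.67, punchy: 0.55 };
// ASSUMED tuning values (seconds); the real ones are owned by the framework.
const TUNING = { mergeGap: 1.5, prePad: 0.5, postPad: 1.0, minZoom: 1.0 };

function tilePoints(points: FocusPoint[], style: ZoomStyle, durationSec: number): ReframeSegment[] {
  const sorted = [...points].sort((a, b) => a.t - b.t);

  // 1. Group consecutive points spaced within mergeGap.
  const groups: FocusPoint[][] = [];
  for (const p of sorted) {
    const last = groups[groups.length - 1];
    const prev = last ? last[last.length - 1] : undefined;
    if (last && prev && p.t - prev.t <= TUNING.mergeGap) last.push(p);
    else groups.push([p]);
  }

  // 2. Pad each group, drop short windows, merge overlapping ones.
  const windows: ZoomWindow[] = [];
  for (const g of groups) {
    const first = g[0];
    const lastPt = g[g.length - 1];
    if (!first || !lastPt) continue;
    const start = Math.max(0, first.t - TUNING.prePad);
    const end = Math.min(durationSec, lastPt.t + TUNING.postPad);
    if (end - start < TUNING.minZoom) continue;
    // ASSUMPTION: focus on the mean position of the group's points.
    const x = g.reduce((s, p) => s + p.x, 0) / g.length;
    const y = g.reduce((s, p) => s + p.y, 0) / g.length;
    const prevWin = windows[windows.length - 1];
    if (prevWin && start <= prevWin.end) prevWin.end = Math.max(prevWin.end, end);
    else windows.push({ start, end, x, y });
  }

  // 3. Tile the timeline: frame-center in the gaps, point punch-in per window.
  const segments: ReframeSegment[] = [];
  let cursor = 0;
  for (const w of windows) {
    if (w.start > cursor) {
      segments.push({ startSec: cursor, endSec: w.start, focus: { mode: "frame-center" }, zoom: 1 });
    }
    segments.push({ startSec: w.start, endSec: w.end, focus: { mode: "point", x: w.x, y: w.y }, zoom: STYLE_ZOOM[style] });
    cursor = w.end;
  }
  if (cursor < durationSec) {
    segments.push({ startSec: cursor, endSec: durationSec, focus: { mode: "frame-center" }, zoom: 1 });
  }
  return segments;
}

const demoSegments = tilePoints(
  [{ t: 8, x: 0.72, y: 0.66 }, { t: 3, x: 0.28, y: 0.42 }],
  "punchy",
  12,
);
console.log(JSON.stringify(demoSegments, null, 2));
```

With these assumed paddings, two clicks at 3 s and 8 s on a 12 s clip yield five gapless segments. When the canonical `autoZoom` is available, prefer it over a hand-rolled version like this.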
49
+
50
+ ## 3. Compose
51
+
52
+ One video track, the screen recording as a media clip carrying the reframe.
53
+ `cueframe composition put <projectId> -b @composition.json` (see `cueframe-cli`).
54
+ Choose `-a 16:9` for landscape product video, `9:16` for social.
55
+
56
+ ## 4. Polish — layer overlay primitives (this is what makes it look "produced")
57
+
58
+ A bare auto-zoom reads as a raw screen-grab. Add overlay/effect clips referencing
59
+ registered primitives (`registry/params-required.json` lists each one's params;
60
+ `$brand:` tokens bind brand colors/fonts at render). Tasteful default set for a
61
+ product demo:
62
+
63
+ | Goal | primitiveId | kind |
64
+ |---|---|---|
65
+ | animated cursor tracing the click path | `cursor-flow` / `simulated-cursor` | overlay |
66
+ | title / context | `lower-third` or `heroText` | overlay |
67
+ | highlight a clicked element | `marker-highlight` / `pulsing-indicator` | overlay |
68
+ | device/browser frame | `device-mockup-zoom` / `browser-flow` | overlay |
69
+ | captions (if narrated) | the top-level `captions` field | — |
70
+ | finish/look | `color-grade` + `vignette` | effect |
71
+
72
+ A primitive is the `source` of a clip; the clip wraps it with `id`/`startTime`/`duration`:
73
+ - Overlay clip (on a `kind:"overlay"` track): `{ "id":"t1", "startTime":1, "duration":3, "source":{ "kind":"overlay", "primitiveId":"lower-third", "params":{…}, "zPlane":"front" } }`.
74
+ - Effect clip (on a `kind:"effect"` track): `{ "id":"e1", "startTime":0, "duration":9, "source":{ "kind":"effect", "primitiveId":"color-grade", "params":{…} } }`.
75
+ Simultaneous overlays go on **separate tracks** — clips on one track can't overlap in time.
76
+ **Don't over-decorate** — 2–4 primitives. Cursor + title + grade is usually enough.
77
+
78
+ ## 5. Render
79
+
80
+ `cueframe render <projectId> -o out.mp4 --json`. Verify the output (ffprobe +
81
+ sample frames) before declaring done.
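One way to make the duration part of that verification repeatable is sketched below. It only parses the JSON that `ffprobe` prints; the helper name `checkProbeDuration` and the 0.1 s tolerance are assumptions for this example, and sampling frames still needs a separate step.

```typescript
// Input is the stdout of a command such as:
//   ffprobe -v error -show_entries format=duration -of json out.mp4
// which prints JSON like {"format": {"duration": "12.000000"}} (duration is a string).
interface ProbeCheck { ok: boolean; actualSec: number; message: string }

function checkProbeDuration(
  probeJson: string,
  expectedSec: number,
  toleranceSec = 0.1, // ASSUMED tolerance; tune to your container and fps
): ProbeCheck {
  const parsed = JSON.parse(probeJson) as { format?: { duration?: string } };
  const actualSec = Number(parsed.format?.duration);
  if (!Number.isFinite(actualSec)) {
    return { ok: false, actualSec: NaN, message: "no format.duration in ffprobe output" };
  }
  const ok = Math.abs(actualSec - expectedSec) <= toleranceSec;
  return {
    ok,
    actualSec,
    message: ok ? "duration matches" : `expected ~${expectedSec}s, got ${actualSec}s`,
  };
}

console.log(checkProbeDuration('{"format":{"duration":"12.033000"}}', 12));
```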
82
+
83
+ ## Worked examples — composition JSON
84
+
85
+ These are **orientation, not the contract.** The live wire schema is
86
+ `cueframe schema composition`, and the way to know a draft is correct is
87
+ `cueframe composition validate -b @composition.json` — it runs the same
88
+ validator the server enforces, with no save and no render. Discover + validate
89
+ before you render; never spend a render to find a schema mistake.
90
+
91
+ Each block below is a complete `@composition.json` (the bare object `composition put`
92
+ accepts: top-level `v`, `format`, `tracks`, optional `captions`). All five validate
93
+ against the real wire schema + save invariants (swap the `<…MediaId>` placeholders for
94
+ real `cueframe upload` ids). Conventions that keep them valid:
95
+
96
+ - Tracks hold **`contents`** (not `clips`); each clip wraps a `source`.
97
+ - A clip's `source.kind` must match the track `kind`: `media` → `video`/`image`/`audio`
98
+ tracks, `overlay` → `overlay` tracks, `effect` → `effect` tracks.
99
+ - Clip/segment times are **seconds** (`startTime`, `duration`, `startSec`, `endSec`,
100
+ `trim`); caption word times are **milliseconds** (`startMs`/`endMs`).
101
+ - `region`/`fit` are **clip-level** (siblings of `source`); `reframe` lives on the media `source`.
102
+ - Clips on one track can't overlap — put simultaneous layers on **separate tracks**.
103
+ - `zoom < 1` punches in (range `[0.1, 1]`, default `1`). Durations are illustrative — match your source.
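A few of these conventions are easy to check locally before calling the validator. The sketch below is a hypothetical pre-flight lint, not a substitute for `cueframe composition validate`: the `lintTracks` name and the trimmed-down types are assumptions, and it covers only kind matching, same-track overlap, and the zoom range.

```typescript
interface ClipLike {
  id: string;
  startTime: number;
  duration: number;
  source: { kind: string; reframe?: { segments: { zoom?: number }[] } };
}
interface TrackLike { id: string; kind: string; contents: ClipLike[] }

// Which source.kind each track kind accepts, per the conventions above.
const EXPECTED_SOURCE: Record<string, string> = {
  video: "media", image: "media", audio: "media", overlay: "overlay", effect: "effect",
};

function lintTracks(tracks: TrackLike[]): string[] {
  const problems: string[] = [];
  for (const track of tracks) {
    const expected = EXPECTED_SOURCE[track.kind];
    const clips = [...track.contents].sort((a, b) => a.startTime - b.startTime);
    let prevEnd = -Infinity;
    let prevId = "";
    for (const clip of clips) {
      if (expected !== undefined && clip.source.kind !== expected) {
        problems.push(`${track.id}/${clip.id}: source.kind "${clip.source.kind}" on a ${track.kind} track`);
      }
      // Clips on one track may touch but not overlap in time.
      if (clip.startTime < prevEnd) {
        problems.push(`${track.id}/${clip.id}: overlaps ${prevId}`);
      }
      const end = clip.startTime + clip.duration;
      if (end > prevEnd) { prevEnd = end; prevId = clip.id; }
      // zoom defaults to 1 and must stay within [0.1, 1].
      for (const seg of clip.source.reframe?.segments ?? []) {
        const zoom = seg.zoom ?? 1;
        if (zoom < 0.1 || zoom > 1) problems.push(`${track.id}/${clip.id}: zoom ${zoom} outside [0.1, 1]`);
      }
    }
  }
  return problems;
}

console.log(lintTracks([
  { id: "ov", kind: "overlay", contents: [
    { id: "a", startTime: 0, duration: 4, source: { kind: "overlay" } },
    { id: "b", startTime: 3.5, duration: 2, source: { kind: "media" } },
  ] },
]));
```

An empty result only means these three conventions hold; the schema and save invariants are still the validator's job.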
104
+
105
+ ### EX1 · Auto-zoom screen demo (16:9)
106
+ One screen recording with `reframe` punch-ins (the `autoZoom` output: `frame-center` in the gaps, `point` over each click) + a cinematic grade.
107
+
108
+ ```json
109
+ {
110
+ "v": 1,
111
+ "format": { "aspectRatio": "16:9", "fps": 30 },
112
+ "tracks": [
113
+ {
114
+ "id": "v-screen", "kind": "video", "contents": [
115
+ {
116
+ "id": "rec", "startTime": 0, "duration": 12,
117
+ "source": {
118
+ "kind": "media", "mediaId": "<mediaId>",
119
+ "reframe": { "segments": [
120
+ { "startSec": 0, "endSec": 2, "focus": { "mode": "frame-center" }, "zoom": 1 },
121
+ { "startSec": 2, "endSec": 5, "focus": { "mode": "point", "x": 0.28, "y": 0.42 }, "zoom": 0.6, "ease": { "in": 0.4, "out": 0.4 } },
122
+ { "startSec": 5, "endSec": 7, "focus": { "mode": "frame-center" }, "zoom": 1, "ease": { "in": 0.4, "out": 0.4 } },
123
+ { "startSec": 7, "endSec": 10, "focus": { "mode": "point", "x": 0.72, "y": 0.66 }, "zoom": 0.55, "ease": { "in": 0.4, "out": 0.4 } },
124
+ { "startSec": 10, "endSec": 12, "focus": { "mode": "frame-center" }, "zoom": 1, "ease": { "in": 0.4, "out": 0.4 } }
125
+ ] }
126
+ }
127
+ }
128
+ ]
129
+ },
130
+ {
131
+ "id": "fx", "kind": "effect", "contents": [
132
+ { "id": "grade", "startTime": 0, "duration": 12, "source": { "kind": "effect", "primitiveId": "color-grade", "params": { "preset": "cinematic" } } }
133
+ ]
134
+ }
135
+ ]
136
+ }
137
+ ```
138
+
139
+ ### EX2 · Social vertical with captions (9:16)
140
+ Reframe + a `lower-third` title, then a `cursor-flow` overlay (non-overlapping on one overlay track), transcript `captions` (ms timing), warm grade. `$brand:` refs bind brand colors/fonts at render.
141
+
142
+ ```json
143
+ {
144
+ "v": 1,
145
+ "format": { "aspectRatio": "9:16", "fps": 30, "platform": "tiktok" },
146
+ "tracks": [
147
+ {
148
+ "id": "v", "kind": "video", "contents": [
149
+ {
150
+ "id": "rec", "startTime": 0, "duration": 10,
151
+ "source": {
152
+ "kind": "media", "mediaId": "<mediaId>",
153
+ "reframe": { "segments": [
154
+ { "startSec": 0, "endSec": 4, "focus": { "mode": "point", "x": 0.3, "y": 0.4 }, "zoom": 0.7, "ease": { "in": 0.4, "out": 0.4 } },
155
+ { "startSec": 4, "endSec": 10, "focus": { "mode": "point", "x": 0.62, "y": 0.55 }, "zoom": 0.62, "ease": { "in": 0.4, "out": 0.4 } }
156
+ ] }
157
+ }
158
+ }
159
+ ]
160
+ },
161
+ {
162
+ "id": "ov", "kind": "overlay", "contents": [
163
+ {
164
+ "id": "title", "startTime": 0, "duration": 3.5,
165
+ "source": {
166
+ "kind": "overlay", "primitiveId": "lower-third",
167
+ "params": {
168
+ "text": "AI video, one command", "subtitle": "cueframe", "variant": "modern",
169
+ "tokens": { "textColor": "$brand:colors.textOnMedia", "accentColor": "$brand:colors.accent", "fontFamily": "$brand:fonts.heading.family" }
170
+ }
171
+ }
172
+ },
173
+ {
174
+ "id": "cursor", "startTime": 3.5, "duration": 6,
175
+ "source": {
176
+ "kind": "overlay", "primitiveId": "cursor-flow",
177
+ "params": {
178
+ "waypoints": [ { "x": 200, "y": 180 }, { "x": 540, "y": 240, "click": true, "label": "Generate" }, { "x": 1040, "y": 520, "click": true, "label": "Publish" } ],
179
+ "cursorColor": "$brand:colors.text", "showTargets": true
180
+ }
181
+ }
182
+ }
183
+ ]
184
+ },
185
+ {
186
+ "id": "fx", "kind": "effect", "contents": [
187
+ { "id": "grade", "startTime": 0, "duration": 10, "source": { "kind": "effect", "primitiveId": "color-grade", "params": { "preset": "warm" } } }
188
+ ]
189
+ }
190
+ ],
191
+ "captions": {
192
+ "segments": [
193
+ { "words": [
194
+ { "text": "Watch", "startMs": 0, "endMs": 320 },
195
+ { "text": "this", "startMs": 320, "endMs": 560, "emphasis": true },
196
+ { "text": "render", "startMs": 560, "endMs": 980 },
197
+ { "text": "in", "startMs": 980, "endMs": 1120 },
198
+ { "text": "one", "startMs": 1120, "endMs": 1360, "emphasis": true },
199
+ { "text": "command", "startMs": 1360, "endMs": 1900 }
200
+ ] }
201
+ ],
202
+ "style": { "fontFamily": "Inter", "fontSize": 6.5, "fontWeight": 900, "color": "#ffffff", "highlightColor": "#10B981", "position": "bottom", "textTransform": "uppercase" }
203
+ }
204
+ }
205
+ ```
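Because clip times are seconds while caption word times are milliseconds, a transcript with second-based timings needs converting before it goes into `captions`. A minimal sketch, assuming a hypothetical transcript word shape (`text`, `start`, `end` in seconds):

```typescript
// Hypothetical transcript shape: word timings in seconds.
interface TranscriptWord { text: string; start: number; end: number }
// Caption word shape used in the example above (emphasis omitted).
interface CaptionWord { text: string; startMs: number; endMs: number }

function toCaptionWords(words: TranscriptWord[]): CaptionWord[] {
  return words.map((w) => ({
    text: w.text,
    // Round to whole milliseconds to avoid float noise such as 560.0000000000001.
    startMs: Math.round(w.start * 1000),
    endMs: Math.round(w.end * 1000),
  }));
}

const captionDraft = {
  segments: [
    { words: toCaptionWords([
      { text: "Watch", start: 0, end: 0.32 },
      { text: "this", start: 0.32, end: 0.56 },
    ]) },
  ],
};
console.log(JSON.stringify(captionDraft));
```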
206
+
207
+ ### EX3 · Picture-in-picture (16:9)
208
+ Full-frame screen recording + a webcam **inset** placed with `region`/`fit` on a second video track (composites on top by track order). Two uploads → two `mediaId`s.
209
+
210
+ ```json
211
+ {
212
+ "v": 1,
213
+ "format": { "aspectRatio": "16:9", "fps": 30 },
214
+ "tracks": [
215
+ { "id": "v-screen", "kind": "video", "contents": [
216
+ { "id": "screen", "startTime": 0, "duration": 10, "source": { "kind": "media", "mediaId": "<screenMediaId>" } }
217
+ ] },
218
+ { "id": "v-cam", "kind": "video", "contents": [
219
+ { "id": "cam", "startTime": 0, "duration": 10, "source": { "kind": "media", "mediaId": "<webcamMediaId>" }, "region": { "x": 0.68, "y": 0.04, "w": 0.3, "h": 0.3 }, "fit": "cover" }
220
+ ] }
221
+ ]
222
+ }
223
+ ```
224
+
225
+ ### EX4 · Split-screen (16:9)
226
+ Two sources side-by-side, each with a half-frame `region` on its own video track.
227
+
228
+ ```json
229
+ {
230
+ "v": 1,
231
+ "format": { "aspectRatio": "16:9", "fps": 30 },
232
+ "tracks": [
233
+ { "id": "v-left", "kind": "video", "contents": [
234
+ { "id": "left", "startTime": 0, "duration": 8, "source": { "kind": "media", "mediaId": "<leftMediaId>" }, "region": { "x": 0, "y": 0, "w": 0.5, "h": 1 }, "fit": "cover" }
235
+ ] },
236
+ { "id": "v-right", "kind": "video", "contents": [
237
+ { "id": "right", "startTime": 0, "duration": 8, "source": { "kind": "media", "mediaId": "<rightMediaId>" }, "region": { "x": 0.5, "y": 0, "w": 0.5, "h": 1 }, "fit": "cover" }
238
+ ] }
239
+ ]
240
+ }
241
+ ```
242
+
243
+ ### EX5 · Presenter intro — title behind the speaker (9:16)
244
+ Talking-head with an `active-speaker` reframe and a `heroText` title at `zPlane:"behind-subject"`. The speaker **detection** and the person **matte** are both resolved automatically at render (see the `cueframe-cli` skill's "Render resolves expensive intents" section) — author the intent, render, done.
245
+
246
+ ```json
247
+ {
248
+ "v": 1,
249
+ "format": { "aspectRatio": "9:16", "fps": 30 },
250
+ "tracks": [
251
+ { "id": "v", "kind": "video", "contents": [
252
+ { "id": "cam", "startTime": 0, "duration": 6, "source": { "kind": "media", "mediaId": "<mediaId>", "reframe": { "segments": [ { "startSec": 0, "endSec": 6, "focus": { "mode": "active-speaker" }, "zoom": 0.8 } ] } } }
253
+ ] },
254
+ { "id": "ov", "kind": "overlay", "contents": [
255
+ { "id": "hero", "startTime": 0.5, "duration": 5, "source": {
256
+ "kind": "overlay", "primitiveId": "heroText",
257
+ "params": { "text": "SHIP IN MINUTES", "fontSizePct": 14, "textColor": "$brand:colors.textOnMedia", "fontFamily": "$brand:fonts.heading.family" },
258
+ "zPlane": "behind-subject", "anchor": { "space": "scene", "x": 0.5, "y": 0.42 }
259
+ } }
260
+ ] }
261
+ ]
262
+ }
263
+ ```
264
+
265
+ ## Notes / honesty
266
+
267
+ - **Picture-in-picture / inset cards render** — per-clip output-space placement
268
+ (`region:{x,y,w,h}` in `[0,1]` + `fit`) is wired for PiP / inset / split-screen,
269
+ for video, image, and overlay clips. Set `region`/`fit` on the **clip** (siblings
270
+ of `source`, not inside it — or author via the `set_clip_region` agent tool); two
271
+ clips with different placement composite by track order instead of occluding the
272
+ background. See EX3 / EX4 above.
273
+ - This recipe lives in a skill on purpose: the engine stays use-case-free; the
274
+ product opinions (style, which primitives, layout) live here and ship via
275
+ `cueframe install`. Add sibling recipes (`cueframe-podcast-clip`, …) the same way.